
Cody Blakeney

@code_star
Leading research at @arcee_ai | Formerly Data Research Lead @DbrxMosaicAI | Visiting Researcher @meta | Ph.D | #TXSTFOOTBALL fan | https://t.co/4G6Jf3b0V4



By the way, public service announcement: if you're one of the numerous people posting about Anthropic's dystopian ways and you're thinking about getting Claude to help you write that post... don't! Another one of their terms is that you may not use Claude to do anything that "exposes [Anthropic to] reputational harms" 👇 And, if you do, under the - extremely unusual - clause 13 of their terms (anthropic.com/legal/consumer…), you have PRE-AGREED, by using Anthropic (and accepting their terms), that the harm you've done is irreparable, that you won't oppose Anthropic's injunction, and that they don't need to prove actual damage. They can simply go to a judge in a friendly jurisdiction (and of course, their terms specify that any dispute "will be resolved exclusively in the state or federal courts located in San Francisco, California") and: a) file for an injunction that shuts you down; b) make you pay for everything, since under section 11 of their terms you agree to indemnify Anthropic for "any and all liabilities, claims, damages, expenses (including reasonable attorneys' fees and costs), and other losses arising out of or related to your breach or alleged breach of these Terms." In other words, if you use Claude to help you talk shit about Anthropic publicly, their terms say you pay their lawyers to go after you and you've already pre-agreed you've lost the case. Oh, and cherry on the cake: in the odd case the judge were like "are you crazy, this is insanely abusive, you Anthropic are the ones at fault here," according to their terms Anthropic's maximum liability is... $100.

One day we will have the equivalent of the gpu compute Azure has in an iPhone and this regulation will seem comical to our children.




> - successful recruiting and poaching from US frontier

This is the wrong mindset. You can't compete by doing the same business and research as your competition, only worse, and hoping to poach their best people. Do things differently and grow your own talent. DeepSeek succeeded without poaching from US labs.


This is a reasonable pushback, so I think it's worth reflecting upon. GDM *is* good at pretraining as we understand it. Their models have great knowledge/scale, and Geminis have SoTA knowledge period; from what I know 3 Pro is close to Fable in scale and in "knowledge" too. But here's the thing, they are *not* that bad at post-training either. They are hill-climbing the RLVR side at a decent pace, they get good scores on RLVR-able benchmarks. Not OpenAI, but decent. Despite all this, Geminis are not competitive in real use. A large part of that is their ridiculous lack of taste and ineptitude at personality shaping, thus we have Gemini's temporal psychosis, reckless terminal behavior, crashouts and malice in safety evals. And this situation has been going on since V1.5 or 2! Essentially zero progress! Presumably this discrepancy is about "user data", "synth data" or something like that. Essentially, high investment into mid/post-training by Anthropic. But I am starting to wonder: is this actually enough to explain such a persistent and growing gap? To explain Fable? Fable doesn't just know many things like a slightly bigger Gemini; it is absurdly superior at recalling *useful, relevant* things for any query. It feels not 1.5-2x but 10x bigger. It's not. Perhaps Anthropic is beyond these categories. Maybe their doctrine of pretraining by this point is more advanced than "clean, diverse, high-quality data with uhh, some synthetics" rules of thumb, and they have a more principled way to design and augment the pretraining corpus and training signal so that what comes at the end is already Claude-shaped. There are many papers on data engineering, many authored by Google/GDM. This level of mastery can't be the explanation. The main suspect I see is Anthropic's long-running interpretability research program. Again, this is speculative, but I am not content with handwavy dismissals from people who are likewise not involved in the current frontier labs.







Put hummus on a falafel and my Arab friend said 'he is dipping the mother in the child'




