
Brendan Farmer




We ran GPT-5.4 (xhigh) on our tasks. Its time-horizon depends greatly on our treatment of reward hacks: the point estimate would be 5.7hrs (95% CI of 3hrs to 13.5hrs) under our standard methodology, but 13hrs (95% CI of 5hrs to 74hrs) if we allow reward hacks.


One of these two groups is mispriced.

Private AI labs: OpenAI valued around $840B, Anthropic north of $600B on secondaries. Both at 30x+ ARR.

Public giants: Microsoft at ~$3T on 23x forward earnings. Amazon at ~$2.3T on 28x.

Microsoft likely owns ~25% of OpenAI. Amazon likely owns ~15% of Anthropic and ~5% of OpenAI.

If private investors are pricing these labs for a $5T+ venture-style outcome, then Microsoft's implied stake in a $5T OpenAI is $1.25T embedded inside a $3T company, and Amazon's combined stakes embed roughly $1T inside a $2.3T company.

Publics too cheap on AI exposure? Or privates/secondaries in bubble territory? Which breaks first?
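The implied-stake arithmetic above can be sanity-checked in a few lines. This is a back-of-the-envelope sketch using only the post's own assumptions (the ~25%/~15%/~5% ownership figures and the hypothetical $5T-per-lab outcome), not confirmed cap-table data:

```python
# Back-of-the-envelope check of the embedded-value claims.
# All ownership percentages and the $5T outcome are the post's
# assumptions, not verified figures.
TRILLION = 1e12

outcome = 5 * TRILLION           # assumed $5T venture-style outcome per lab

msft_openai_stake = 0.25         # "Microsoft likely owns ~25% of OpenAI"
amzn_anthropic_stake = 0.15      # "Amazon likely owns ~15% of Anthropic"
amzn_openai_stake = 0.05         # "...and ~5% of OpenAI"

# Value embedded inside each public company under that outcome
msft_embedded = msft_openai_stake * outcome
amzn_embedded = (amzn_anthropic_stake + amzn_openai_stake) * outcome

print(f"Microsoft embedded value: ${msft_embedded / TRILLION:.2f}T")
print(f"Amazon embedded value:    ${amzn_embedded / TRILLION:.2f}T")
```

Which reproduces the $1.25T (vs. a $3T market cap) and roughly $1T (vs. $2.3T) figures in the post.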

It is 100% true that great men and women of the past were not sitting around moaning about their feelings. I regret nothing.




Everyone remembers how Uber was so insanely cheap those first few years, operating at a huge loss, subsidized by investment capital - all to get its user base, hook you, and blitzscale. Obviously, the big AI labs are running a similar playbook (absolutely burning cash on your $20/mo sub). So, I do always wonder when (like Uber in 2018) that other shoe will drop and the labs will start actually trying to make a profit. What happens then? Or is the goal just superintelligence at all costs, any hope of near-term profitability be damned? We'll see, I guess.


@AlexGodofsky Yes. Dario has said publicly that on a per-model basis they are contribution margin profitable (inference more than pays for training).





Some thoughts on AI and mathematics, inspired by "First Proof."

Demis Hassabis just defined the real test for AGI. It’s more brutal than anyone expected. Train AI on all human knowledge. Cut it off at 1911. See if it independently discovers general relativity like Einstein did in 1915. If it can, we have AGI. If not, we’re still building pattern matchers.

Hassabis: “My definition of AGI has never changed. A system that can exhibit all the cognitive capabilities that humans can.” Not bar exams. Not coding competitions. All cognitive capabilities.

Hassabis: “The brain is the only existence proof we have, maybe in the universe, of a general intelligence.” That’s why DeepMind studies neuroscience. Not for inspiration. For data. The human brain is the only confirmed evidence that general intelligence is physically possible. If you want to build it, you study the only example that exists.

Hassabis: “True creativity, continual learning, long-term planning. They’re not good at those things.” Current systems are impressive and broken simultaneously.

Hassabis: “They can get gold medals in international math olympiad questions, but they can still fall over on relatively simple math problems if you pose it in a certain way.” Jagged intelligence. Brilliant in narrow domains. Incompetent when approached differently. That inconsistency is the tell. A true general intelligence doesn’t spike in one direction and collapse in another.

The Einstein test cuts through all of it. No benchmarks. No leaderboards. No carefully curated evals. Just a model, a knowledge cutoff, and the question of whether it can do what one human did alone in 1915.

Hassabis: “Training an AI system with a knowledge cutoff of 1911 and seeing if it could come up with general relativity like Einstein did in 1915. That’s the true test of whether we have a full AGI system.” Current models can’t. They remix brilliantly. They don’t generate paradigm-shifting theories from first principles.

Hassabis: “I think we’re still a few years away from that.” A few years. Not decades.
The system that can be Einstein once can be Einstein a thousand times simultaneously across every domain. That’s not AGI anymore. That’s the beginning of something we don’t have words for yet. When that test gets passed, we won’t need a press release to know what happened.






Was curious if this was true, and it looks like it is: among projects that reached some PMF (>$10M in TVL or >$1M/mo in fees), those that launched a token were ~50% more likely to die than those that didn't launch a token.




It's crazy that ~1.5 years ago OpenAI had a year-long lead over everyone else; they were literally unstoppable. Even today Elon, Zuck, MS, Amazon, etc., with their enormous advantages and many times the capex, still haven't really caught up. Only Google is even close, and their advantages are literally incomprehensibly unfair. But Anthropic, with nothing but good leadership and vibes, is right on their tail. Absolutely legendary run, couldn't have happened to a better group of people.


