Nathan Myers
69 posts

@holoday_
princeton '28 | professional button pusher
Joined November 2020
60 Following 37 Followers
Nathan Myers retweeted

if you do an infinite series of 5 minute tasks which are not priority you eat up all the time to do the most important ones which take longer
Garry Tan@garrytan
New item in my SOUL md tonight

@holoday_ The baselines we use are wider than that (>4 cm), but you can always change the code to generate your own. You should definitely check out @_ilya_c's very great work on this (though they consider the unsupervised setting).
arxiv.org/abs/2212.12324
Nathan Myers retweeted

TIL Steve Jobs decided to use glass for the first iPhone *after* announcing it and noticing his plastic screen was scratched.
He convinced @Corning to resurrect an invention they made in the 1960’s, convert a factory, and produce millions of screens in a few months.


Nathan Myers retweeted

A non-trivial share of Anthropic’s gains on Opus and Sonnet likely came from RL Env partners of Anthropic.
Anthropic is the single largest buyer across both coding and computer use environments (among labs).
They are spending on the order of tens of millions annually on RL environments (across vendors), and as the need for good computer use / long horizon tasks rises, 100s of millions to “specific” vendors will be the norm.
Nathan Myers retweeted

Three weeks ago there were rumors that one of the labs had completed its largest ever successful training run, and that the model that emerged from it performed far above both internal expectations and what people assumed the scaling laws would predict. At the time these were only rumors, and no lab was attached to them. But in light of what we now know about Mythos, they look more credible, and the lab was probably Anthropic.
Around the same time there were also rumors that one of the frontier labs had made an architectural breakthrough. If you are in enough group chats, you hear claims like this constantly, and most turn out to be nothing. But if Anthropic found that training above a certain scale, or in a certain way at that scale, produces capabilities that sit far above the prior trendline, then that is an architectural breakthrough.
I think the leaked blog post was real, but still a draft. Mythos and Capybara were both candidate names for the new tier, though Mythos may now have enough mindshare that they end up keeping it. The specific rumor in early March was that the run produced a model roughly twice as performant as expected. That remains unconfirmed. What is confirmed is that Anthropic told Fortune the new model is a 'step change,' and a sudden 2x would certainly fit the definition.
We will find out in April how much of this is true. My own view is that the broad shape of this is correct even if some of the numbers are wrong. And if it is substantially accurate, then it also casts OpenAI's recent restructuring in a new light. If very large training runs are about to become essential to staying in the game, then a lot of their recent decisions, like dropping Sora, make even more sense strategically.
For the public, this would mean the best models in the world are about to become much more expensive to serve, and therefore much more expensive to use. That will put pressure on rate limits, pricing, and subscription plans that are already subsidized to some unknown degree. Instead of becoming too cheap to meter, frontier intelligence may be about to become too expensive for most of humanity to afford.
Second-order effects: compute, memory, and energy are about to become much more important than they already are. In the blog they describe the new model as not just an improvement, but as having 'dramatically higher scores' than Opus 4.6 in coding and reasoning, and as being 'far ahead' of any other current models. If this is the new reality, then scale is about to become king in a whole new way. It would also mean, as usual, that Jensen wins again.

@RhysSullivan i bet it has to feel good asf to be a service written in rust and not have a memory leak at all
Nathan Myers retweeted








