michael abbott
@mabb0tt
3.2K posts
building here
Joined March 2009
1.8K Following · 22.9K Followers
michael abbott reposted
Will Held @WilliamBarrHeld
Our 1e23 "Delphi" (~25B param model trained for ~600B tokens) run for Marin has entered its learning rate decay phase. Lots of spikes at this scale, very scary! Despite that, the run is looking on track to be close to our pre-registered scaling laws predictions. Stay tuned...
Percy Liang @percyliang

In Marin, we are trying to get really good at scaling laws. We have trained models up to 1e22 FLOPs and have made a prediction of the loss at 1e23 FLOPs, which @WilliamBarrHeld is running. This prediction is preregistered on GitHub, so we'll see in a few days how accurate our prediction was. What we want is not just a single model but a training recipe that scales reliably.

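The preregistered-prediction workflow described above (fit a scaling law on smaller runs, commit to an extrapolated loss, then check it against the big run) can be sketched as follows. The loss values and the fitted functional form here are illustrative assumptions, not Marin's actual data or recipe:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical final losses from runs at increasing compute budgets
# (FLOPs). These numbers are invented for illustration only.
flops = np.array([1e19, 1e20, 1e21, 1e22])
loss = np.array([4.60, 4.22, 3.90, 3.61])

# Saturating power law L(C) = E + A * C^(-alpha), parameterized in
# log-compute for numerical stability.
def scaling_law(log_c, E, A, alpha):
    return E + A * np.exp(-alpha * log_c)

(E, A, alpha), _ = curve_fit(
    scaling_law, np.log(flops), loss,
    p0=[2.0, 30.0, 0.05], maxfev=10000,
)

# Preregister this number before launching the 1e23 run, then compare
# it to the measured loss once training finishes.
predicted = scaling_law(np.log(1e23), E, A, alpha)
print(f"predicted loss at 1e23 FLOPs: {predicted:.2f}")
```

The point of preregistering (e.g. committing the prediction to GitHub first) is that the extrapolation can't be quietly adjusted after the 1e23 result comes in.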
michael abbott reposted
Will Held @WilliamBarrHeld
Scaling laws are "just" regressions. But a biased fitting method can quietly misallocate millions of $ of compute at frontier scales. My coworker Eric Czech dug into a bias in parabolic IsoFLOP fits used by Meta, DeepSeek, Microsoft, Waymo, et al. for their scaling laws🧵
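The parabolic IsoFLOP fit the thread critiques works roughly like this: at a fixed compute budget, train several model sizes, fit a parabola to loss versus log(model size), and read the compute-optimal size off the vertex. A minimal sketch with made-up numbers (not the actual fits from any of the labs mentioned):

```python
import numpy as np

# Hypothetical IsoFLOP slice: final losses of differently sized models
# all trained with the same compute budget (values invented for
# illustration).
sizes = np.array([1e9, 2e9, 4e9, 8e9, 16e9])   # parameters
loss = np.array([2.60, 2.48, 2.44, 2.47, 2.58])

# Fit loss ~ a*log(N)^2 + b*log(N) + c, the parabolic form used in
# Chinchilla-style IsoFLOP analyses.
a, b, c = np.polyfit(np.log(sizes), loss, deg=2)

# The parabola's vertex estimates the compute-optimal model size for
# this budget; repeating across budgets yields an N_opt(C) scaling law.
n_opt = np.exp(-b / (2 * a))
print(f"estimated compute-optimal size: {n_opt:.2e} parameters")
```

Because this is a plain least-squares parabola, any asymmetry or curvature misspecification in the true IsoFLOP curve shifts the vertex systematically, which is the kind of bias the thread is pointing at: a small per-fit error compounds when the vertices are themselves regressed across budgets.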
michael abbott reposted
Percy Liang @percyliang
If you use Pythia and like it, we're making an updated version. Tell us what you want. Here's a question for y'all: would you rather have a scaling suite trained on Nemotron-CC (very high quality, some distilled) or CommonPile (public domain, permissively licensed, more crunchy)?
Will Held @WilliamBarrHeld

2026 aesthetic: stable scaling runs

michael abbott reposted
OpenRouter @OpenRouter
@AnjneyMidha @MaikaThoughts @alexatallah @cclark One finding: we observe a Cinderella "Glass Slipper" effect for new models. Early users of a new LLM either churn quickly or become part of a foundational cohort with much higher retention than other users. They are early adopters who can "lead" the rest of the market (more details 👇)
michael abbott reposted
Will Held @WilliamBarrHeld
After some trials and tribulations, we trained a 32B parameter base model and it's looking pretty good! Looking forward to working with the community on post-training and continuing to develop in the open!!!
Percy Liang @percyliang

⛵Marin 32B Base (mantis) is done training! It is the best open-source base model (beating OLMo 2 32B Base) and it’s even close to the best comparably-sized open-weight base models, Gemma 3 27B PT and Qwen 2.5 32B Base. Ranking across 19 benchmarks:

michael abbott reposted
Percy Liang @percyliang
⛵Marin 32B Base (mantis) is done training! It is the best open-source base model (beating OLMo 2 32B Base) and it’s even close to the best comparably-sized open-weight base models, Gemma 3 27B PT and Qwen 2.5 32B Base. Ranking across 19 benchmarks:
michael abbott reposted
Everest Today @EverestToday
Born on this day, September 17, 1944 — Reinhold Messner, THE GREATEST! The first person ever to climb all fourteen eight-thousanders without using supplementary oxygen — and still here to share his legendary story. Wishing you a very happy birthday, Reinhold! 🙏 Photo © Roberto Carnevali.
michael abbott reposted
Adam Grant @AdamMGrant
Our personalities describe us, but they don't define us. Our most important qualities are our values. Personality traits are the tendencies we have. They reflect our nature and nurture. Values are the principles we choose. They reveal our commitment and character.