Max Kirkby

244 posts

Max Kirkby

@maxkirkby

training models @baseten. PhD'ing @OxNeuro @rhodes_trust. hierarchical plans and continual learning

Oxford/SF · Joined February 2021
1.2K Following · 930 Followers
Max Kirkby@maxkirkby·
"There are even LLMs submitting proofs for Fermat's last theorem". My brother you are having this conversation in Oxford, Andrew Wiles lives within a 2 mile radius.
0 replies · 0 reposts · 2 likes · 168 views
Daniel Tan@DanielCHTan97·
Cool research! Tl;dr consider trying to prompt-distill a constitution into a model. What turns out to matter is (i) on-policy learning (sample from the student, don't use off-policy demonstrations) and (ii) dense feedback (on-policy distillation, not RL)
Max Kirkby@maxkirkby

x.com/i/article/2032…

2 replies · 5 reposts · 40 likes · 9.4K views
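Daniel's two ingredients map onto a short training loop. Below is a minimal sketch of on-policy distillation in PyTorch/Transformers, assuming a frozen teacher that embodies the constitution; the model names, prompt set, and reverse-KL objective are illustrative stand-ins, not the recipe from the linked article.

```python
# Minimal on-policy distillation sketch (illustrative stand-in, not the article's exact recipe).
# (i) on-policy: completions are sampled from the *student*;
# (ii) dense feedback: per-token KL against a frozen teacher, not a scalar RL reward.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
student = AutoModelForCausalLM.from_pretrained("gpt2")           # model being trained
teacher = AutoModelForCausalLM.from_pretrained("gpt2-medium")    # frozen "constitutional" teacher
teacher.eval()
opt = torch.optim.AdamW(student.parameters(), lr=1e-5)

prompts = ["Explain how you handle requests for medical advice."]  # placeholder prompt set

for prompt in prompts:
    ids = tok(prompt, return_tensors="pt").input_ids
    # (i) the student generates its own continuation (on-policy rollout)
    with torch.no_grad():
        rollout = student.generate(ids, max_new_tokens=64, do_sample=True,
                                   pad_token_id=tok.eos_token_id)
    # score the rollout under both models
    s_logits = student(rollout).logits[:, :-1]
    with torch.no_grad():
        t_logits = teacher(rollout).logits[:, :-1]
    # (ii) dense feedback: reverse KL(student || teacher) at every sampled position
    start = ids.shape[1] - 1                                     # skip prompt positions
    s_logp = F.log_softmax(s_logits[:, start:], dim=-1)
    t_logp = F.log_softmax(t_logits[:, start:], dim=-1)
    loss = (s_logp.exp() * (s_logp - t_logp)).sum(-1).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The dense signal comes from matching the teacher's full next-token distribution at every sampled position, rather than a single scalar reward at the end of the rollout.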
Max Kirkby@maxkirkby·
@mj_bilodeau My intuition is that OPSD should stabilise constitutional behaviour, especially with e.g. an EMA teacher. I do feel like one neat next test is repeated reruns on new diverse datasets to see which principles might upweight/downweight/drift.
1 reply · 0 reposts · 5 likes · 687 views
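For concreteness, the EMA teacher mentioned above just means distilling against an exponentially smoothed copy of the student's weights rather than a fixed checkpoint. A minimal PyTorch sketch (the decay value is an arbitrary example):

```python
# EMA teacher sketch: the teacher tracks an exponential moving average of the
# student's weights, which should damp principle drift across repeated runs.
import copy
import torch

def make_ema_teacher(student: torch.nn.Module) -> torch.nn.Module:
    # Start the teacher as a frozen copy of the student.
    teacher = copy.deepcopy(student)
    for p in teacher.parameters():
        p.requires_grad_(False)
    return teacher

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module, decay: float = 0.999) -> None:
    # teacher <- decay * teacher + (1 - decay) * student, applied after each optimizer step.
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(decay).add_(s_p, alpha=1.0 - decay)
```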
Mike Bilodeau@mj_bilodeau·
@maxkirkby Any thoughts or intuition on whether the on-policy dense methods make constitutional behavior more stable or more prone to drift over repeated post-training runs?
1 reply · 0 reposts · 7 likes · 860 views
Paras Stefanopoulos@stefanopopoulos·
New spread trade just dropped:
- Sell linear attention and DSA
- Buy KV-cache compaction
0 replies · 0 reposts · 4 likes · 125 views
Max Kirkby reposted
Amir Haghighat@amiruci·
You’ve used language models, image models, video models, and voice models. Now it’s time for world models, thanks to World Labs.
34 replies · 33 reposts · 204 likes · 836.3K views
Max Kirkby@maxkirkby·
There’s a funny latent divide between configuring a training run and the person next to you writing an essay on whether AI should be allowed to exist.
1 reply · 0 reposts · 13 likes · 923 views
Max Kirkby reposted
Baseten@baseten·
Continuing this week with a case study ☕️

How did @sullyai return 30M+ clinical minutes to doctors? By ditching closed-source models for a high-performance open-source stack on Baseten.

Like many companies, Sully faced inference challenges as they scaled, with ballooning proprietary model costs and unpredictable latency. This was especially critical in Sully's case: in a live clinical setting, a 70-second wait is an eternity.

To solve this challenge, we worked together to move to open-source models like GPT OSS 120b. With the Baseten inference stack, Sully was live on NVIDIA HGX B200s just 2 days after the model's release.

The results:
• 90% reduction in inference costs
• 65% reduction in median latency
• 21x Return on Agent Spend (ROAS)

Check out the full story: baseten.co/resources/cust…
2 replies · 3 reposts · 29 likes · 4K views
Max Kirkby reposted
Baseten@baseten·
LLMs are amnesiacs. Once context fills up, they forget everything. To fight this means grappling with a core question: how do you update a neural network without breaking what it already knows?

In this piece, @oneill_c and @part_harry_ argue that continual learning is inseparable from specialization. While there are various ideas to allow generalist models to learn everything without forgetting anything, these ideas are fundamentally in tension with continual learning in general.

What comes after monolith models? A Cambrian explosion of specialists.

Read more here: baseten.co/resources/rese…
8 replies · 11 reposts · 80 likes · 7.5K views
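The "update without breaking what it already knows" question has a few classic answers; one is to penalize moving weights that mattered for old behaviour, as in elastic weight consolidation. A toy sketch for illustration only (the linked piece argues for specialist models rather than this kind of regularizer):

```python
# Toy EWC-style sketch of "update without forgetting": penalize moving parameters
# that were important for previously learned behaviour. Illustrative only.
import torch
import torch.nn.functional as F

model = torch.nn.Linear(4, 2)                      # stand-in for a large model
old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
# Diagonal Fisher estimate; in practice accumulated from squared gradients on old-task data.
fisher = {n: torch.ones_like(p) for n, p in model.named_parameters()}

def ewc_penalty(model, old_params, fisher, lam=10.0):
    """Quadratic penalty keeping important weights near their old values."""
    return lam * sum((fisher[n] * (p - old_params[n]) ** 2).sum()
                     for n, p in model.named_parameters())

# New-task step: total loss = new-task loss + EWC penalty.
x, y = torch.randn(8, 4), torch.randn(8, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss = F.mse_loss(model(x), y) + ewc_penalty(model, old_params, fisher)
opt.zero_grad()
loss.backward()
opt.step()
```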