Max Kirkby
@maxkirkby
training models @baseten. PhD'ing @OxNeuro @rhodes_trust. hierarchical plans and continual learning

Best career hack is to make sure you’re the person in the room who's always having fun.
We replicated Microsoft Research's Generative Adversarial Distillation (GAD) to distill Qwen3-4B from GPT-5.2. Standard black-box distillation teaches a student to copy teacher outputs, but at inference the student generates from its own prefixes: small errors compound and it drifts off the expert distribution. GAD reframes this as an on-policy distillation problem, training a co-evolving discriminator that provides adaptive reward signals on the student's own generations. Exploring methods like this is how our post-training team surfaces new training patterns. Read here: baseten.co/resources/rese…
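
The loop described in the post has three moving parts: on-policy sampling, a discriminator update, and a reward-weighted student update. Below is a minimal, self-contained PyTorch sketch of that adversarial setup, using tiny stand-in models and random tensors in place of real prompts and teacher completions; TinyLM, Discriminator, and all hyperparameters here are illustrative assumptions, not Baseten's or Microsoft's implementation.

```python
# Hedged sketch of GAD-style on-policy distillation. Toy models and random
# "teacher" outputs stand in for Qwen3-4B and GPT-5; names are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, SEQ_LEN, BATCH, HIDDEN = 100, 16, 8, 64

class TinyLM(nn.Module):
    """Stand-in student: per-position next-token logits."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, HIDDEN)
        self.head = nn.Linear(HIDDEN, VOCAB)
    def forward(self, tokens):
        return self.head(self.emb(tokens))  # (B, T, VOCAB)

class Discriminator(nn.Module):
    """Scores whole sequences: high means 'looks like teacher output'."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, HIDDEN)
        self.head = nn.Linear(HIDDEN, 1)
    def forward(self, tokens):
        return self.head(self.emb(tokens).mean(dim=1)).squeeze(-1)  # (B,)

student, disc = TinyLM(), Discriminator()
opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)

for step in range(100):
    prompts = torch.randint(0, VOCAB, (BATCH, SEQ_LEN))      # toy prompts
    teacher_out = torch.randint(0, VOCAB, (BATCH, SEQ_LEN))  # stand-in teacher text

    # 1) On-policy: sample completions from the student's *own* distribution,
    #    so training covers the prefixes it will actually see at inference.
    logits = student(prompts)
    dist = torch.distributions.Categorical(logits=logits)
    student_out = dist.sample()                              # (B, T)

    # 2) Discriminator step: separate teacher sequences (1) from student ones (0).
    d_loss = F.binary_cross_entropy_with_logits(
        disc(teacher_out), torch.ones(BATCH)
    ) + F.binary_cross_entropy_with_logits(
        disc(student_out.detach()), torch.zeros(BATCH)
    )
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 3) Student step: REINFORCE, with the co-evolving discriminator's score
    #    as an adaptive reward on the student's own generations.
    reward = torch.sigmoid(disc(student_out)).detach()       # (B,)
    log_prob = dist.log_prob(student_out).sum(dim=1)         # (B,)
    s_loss = -(reward * log_prob).mean()
    opt_s.zero_grad(); s_loss.backward(); opt_s.step()
```

The detail that distinguishes this from standard black-box distillation is step 1: the reward is computed on sequences the student sampled itself, so the training signal matches the distribution the student faces at inference rather than teacher-forced prefixes.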