Luke Bailey

129 posts

@LukeBailey181

CS PhD student @Stanford. Former CS and Math undergraduate @Harvard.

Joined July 2023
325 Following · 442 Followers
Pinned Tweet
Luke Bailey @LukeBailey181
Can interpretability help defend LLMs? We find we can reshape activations while preserving a model’s behavior. This lets us attack latent-space defenses, from SAEs and probes to Circuit Breakers. We can attack so precisely that we make a harmfulness probe output this QR code. 🧵
11 replies · 83 reposts · 372 likes · 58.5K views
Luke Bailey retweeted
Tengyu Ma @tengyuma
Maybe the hardest job in 10 years will be to train humans to be super-AI rather than to train AI to be super-human. Academia💪!! But... one needs to believe this task is feasible, and that this job itself is not replaceable by AI... 😂
7 replies · 6 reposts · 183 likes · 23.6K views
Luke Bailey retweeted
Alex Serrano @sertealex
What if a model could strategically misbehave rarely enough that you'd never catch it during testing? LLMs struggle with calibration in many contexts. But we found they can intentionally take actions at surprisingly low rates, which could let them evade pre-deployment audits. 🧵
11 replies · 12 reposts · 95 likes · 9K views
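The audit-evasion worry above has a simple quantitative core. As my own illustration (not from the paper): if a model misbehaves independently with probability p per action, an audit that samples n actions catches it with probability 1 - (1 - p)^n, so a sufficiently low p slips past any realistically sized pre-deployment audit while still occurring at deployment scale.

```python
def detection_probability(p: float, n: int) -> float:
    """Probability that an n-sample audit observes at least one bad action,
    assuming independent misbehavior with per-action probability p."""
    return 1.0 - (1.0 - p) ** n

# A model misbehaving once per 100k actions is very unlikely to be caught
# by a 1,000-sample audit, yet near-certain to misbehave in deployment.
audit_catch = detection_probability(p=1e-5, n=1_000)        # ≈ 0.01
deploy_occur = detection_probability(p=1e-5, n=10_000_000)  # ≈ 1.0
```

The asymmetry between the two numbers is exactly why intentional low-rate actions would matter for pre-deployment testing.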
Luke Bailey retweeted
Suhas Kotha @kothasuhas
To improve fine-tuning data efficiency, replay generic pre-training data. Not only does this reduce forgetting, it actually improves performance on the fine-tuning domain! Especially when fine-tuning data is scarce in pre-training. (w/ @percyliang)
15 replies · 64 reposts · 498 likes · 70.6K views
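The replay recipe above amounts to mixing a fraction of generic pre-training examples into each fine-tuning batch. A minimal sketch of such a data mixer, assuming a fixed 25% replay fraction (the function name and ratio are illustrative, not the paper's recipe):

```python
import random

def mixed_batches(finetune, pretrain, batch_size=8, replay_frac=0.25, seed=0):
    """Yield fine-tuning batches with a fixed fraction of generic
    pre-training examples replayed in to reduce forgetting."""
    rng = random.Random(seed)
    n_replay = int(batch_size * replay_frac)  # replayed pre-training examples
    n_ft = batch_size - n_replay              # fine-tuning examples
    while True:
        batch = rng.sample(finetune, n_ft) + rng.sample(pretrain, n_replay)
        rng.shuffle(batch)  # avoid a fixed ordering within the batch
        yield batch
```

In a real training loop the two pools would be token streams rather than lists, but the per-batch mixing ratio is the knob the tweet is describing.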
Luke Bailey retweeted
Tanishq Kumar @tanishqkumar07
Fast inference excites me because one AI workload I care a lot about is extremely long-horizon reasoning. Imagine a data center of B200s dedicated entirely to running a model thinking for billions of tokens to prove P vs NP. There, halving latency means thinking twice as hard!
1 reply · 8 reposts · 174 likes · 20.5K views
Luke Bailey retweeted
Cas (Stephen Casper) @StephenLCasper
SOTA agent interest, papers, and products are all accelerating.
1 reply · 1 repost · 1 like · 191 views
Luke Bailey @LukeBailey181
Excited that we are releasing this! The report is much more thorough than our 2024 version, mainly because there are far more AI agents now, and they are being used all over the place.
Cas (Stephen Casper)@StephenLCasper

🚨The 2025 AI Agent Index is out! 🚨 Amidst recent buzz over 🦀 and @NIST’s new agent initiative, we find: - Selective reporting – esp. on safety - Almost all agents backend just 3 model families - Many agents don’t ID themselves as bots online - Big US/China gaps - And more…

0 replies · 0 reposts · 2 likes · 334 views
Luke Bailey retweeted
Tengyu Ma @tengyuma
I’ve worked with so many talented researchers who have magical intuitions for tuning hyperparameters. Hopefully, one day, models trained on millions of runs from base models that understand DL theory can do the same :). A small sign of life below.
Huaqing Zhang@zhqwqwq

🚀Introducing our new work: Configuration-to-Performance Scaling Law with Neural Ansatz. A language model trained on large-scale pretraining logs can accurately predict how training configurations influence pretraining performance and generalize to runs with 10x more compute.

2 replies · 10 reposts · 136 likes · 18.8K views
Luke Bailey retweeted
Huaqing Zhang @zhqwqwq
🚀Introducing our new work: Configuration-to-Performance Scaling Law with Neural Ansatz. A language model trained on large-scale pretraining logs can accurately predict how training configurations influence pretraining performance and generalize to runs with 10x more compute.
15 replies · 31 reposts · 162 likes · 45.4K views
Luke Bailey retweeted
Jakub Pachocki @merettm
Very excited about the "First Proof" challenge. I believe novel frontier research is perhaps the most important way to evaluate capabilities of the next generation of AI models.

We have run our internal model with limited human supervision on the ten proposed problems. The problems require expertise in their respective domains and are not easy to verify; based on feedback from experts, we believe at least six solutions (2, 4, 5, 6, 9, 10) have a high chance of being correct, and some further ones look promising.

We will only publish the solution attempts after midnight (PT), per the authors' guidance - the sha256 hash of the PDF is d74f090af16fc8a19debf4c1fec11c0975be7d612bd5ae43c24ca939cd272b1a .

This was a side-sprint executed in a week mostly by querying one of the models we're currently training; as such, the methodology we employed leaves a lot to be desired. We didn't provide proof ideas or mathematical suggestions to the model during this evaluation; for some solutions, we asked the model to expand upon some proofs, per expert feedback. We also manually facilitated a back-and-forth between this model and ChatGPT for verification, formatting and style. For some problems, we present the best of a few attempts according to human judgement.

We are looking forward to more controlled evaluations in the next round! 1stproof.org #1stProof
242 replies · 357 reposts · 2.8K likes · 2.5M views
Luke Bailey retweeted
Perry Dong @perryadong
Reinforcement learning doesn't scale like supervised learning—yet.

We introduce Transformer Q-Learning (TQL): a method that unlocks scaling of transformer-based value functions in RL.

We show that value-based RL can also achieve performance gains through scale. (1/7)
10 replies · 38 reposts · 264 likes · 52.9K views
Luke Bailey retweeted
Tengyu Ma @tengyuma
Long CoTs can be really slow—ChatGPT pro can take >= 30 minutes! We design and train Divide-and-Conquer-CoT with RL, where reasoning models identify distinct parallelizable sub-tasks. Much better latency vs accuracy tradeoff on math benchmarks.  arxiv.org/abs/2601.23027
6 replies · 16 reposts · 77 likes · 7.5K views
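The paper's contribution is the RL training that teaches the model to emit genuinely parallelizable sub-tasks; the latency win at inference time then comes from fanning those sub-tasks out concurrently. A hypothetical serving-side sketch, with `solve` as a stand-in for a model call (not the paper's system):

```python
from concurrent.futures import ThreadPoolExecutor

def solve(subtask: str) -> str:
    # Stand-in for an inference call on one sub-task; a real system
    # would send this to the model server.
    return f"answer({subtask})"

def divide_and_conquer(subtasks: list[str]) -> list[str]:
    # Wall-clock latency is roughly the max over sub-tasks instead of
    # their sum, which is where the latency/accuracy tradeoff improves.
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        return list(pool.map(solve, subtasks))
```

The hard part the RL handles is upstream of this snippet: deciding which parts of a long chain of thought are actually independent.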
Luke Bailey retweeted
Axiom @axiommathai
1/ AxiomProver has solved Fel's open conjecture on syzygies of numerical semigroups, autonomously generating a formal proof in Lean with zero human guidance. This is the first time an AI system has settled an unsolved research problem in theory-building math with a proof that is mechanically verified.
87 replies · 449 reposts · 2.4K likes · 1M views
Luke Bailey retweeted
Yoshua Bengio @Yoshua_Bengio
Today we’re releasing the International AI Safety Report 2026: the most comprehensive evidence-based assessment of AI capabilities, emerging risks, and safety measures to date. 🧵 (1/17)
65 replies · 373 reposts · 1.1K likes · 417.2K views
Luke Bailey retweeted
Zitong Yang @ZitongYang0
We believe AI research is a special research area: one where AI itself can deliver significant progress. Ronald Fisher laid out the foundation of the scientific method: generate a hypothesis (research ideas) and see if you can falsify it (execution). Execution and experiments therefore form the basis of scientific progress.

For mathematics, execution is somewhat special: the chain of thought of AI models implicitly carries it out, which explains much of the progress in math we are seeing. For AI research, execution is purely in code, which AI is extremely good at. Idea generation is in natural language, which AI can also do, although it is not well calibrated right now. So the natural design is to hook up an AI idea generator and an AI experiment executor end-to-end. This is the future paradigm we propose.

We studied two realistic environments: the nanoGPT speedrun initiated by @kellerjordan0 @karpathy, and the GRPO math-reasoning homework built for Stanford CS336 @stanfordnlp. These two "research environments" cover salient and realistic research topics: LLM pretraining and LLM posttraining. For GRPO post-training, the algorithm discovered by AI (see details in the paper!) outperforms the highest-scoring student solution (github.com/stanford-cs336…). For nanoGPT, human experts are still too good to compete with, but AI does reduce the time-to-loss=3.28 from 35 min to 19 min (for reference, human experts are at 2 min).

The advantage of AI over humans is very simple: AI is tireless. Over the past few months, our wandb logs tell us that our AI researcher tried more than 50K research ideas, iteratively learning from the logs of previous experiments. Humans are, of course, more insightful, but simply can't try so many research ideas. Unfortunately, we didn't close the loop of applying the posttraining algorithm discovered by AI to obtain a stronger model and deploying that model to improve itself even further.

This should just be the beginning of execution-grounded automated AI research. Self-improvement (AI inventing better algorithms to train itself) can be an objective for AI in its own right: an intrinsically motivated goal. This is my last project from Stanford. Deeply grateful to @ChengleiSi for pushing it together and to @saurabhsgupta for funding us.
CLS@ChengleiSi

Can LLMs automate frontier LLM research, like pre-training and post-training? In our new paper, LLMs found post-training methods that beat GRPO (69.4% vs 48.0%), and pre-training recipes faster than nanoGPT (19.7 minutes vs 35.9 minutes). 1/

2 replies · 10 reposts · 92 likes · 12.5K views
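The end-to-end loop described above (an AI proposes ideas, an executor runs them, and the accumulated logs condition the next round) can be sketched in a few lines. Here `propose` and `run` are hypothetical stand-ins for the model call and the experiment harness, not the paper's actual system:

```python
def automated_research(propose, run, n_iters=3):
    """Minimal propose-execute loop: `propose` generates an idea in
    natural language conditioned on past experiment logs, `run` executes
    it in code and returns a score. Returns the best (idea, score) pair."""
    logs, best = [], None
    for _ in range(n_iters):
        idea = propose(logs)          # idea generation, conditioned on logs
        score = run(idea)             # execution: run the experiment
        logs.append((idea, score))    # feed results into the next round
        if best is None or score > best[1]:
            best = (idea, score)
    return best
```

The 50K-ideas scale in the tweet corresponds to cranking `n_iters` (and parallel copies of this loop) far beyond what any human could sustain.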
Luke Bailey retweeted
CLS @ChengleiSi
Can LLMs automate frontier LLM research, like pre-training and post-training? In our new paper, LLMs found post-training methods that beat GRPO (69.4% vs 48.0%), and pre-training recipes faster than nanoGPT (19.7 minutes vs 35.9 minutes). 1/
11 replies · 141 reposts · 574 likes · 105.4K views
Luke Bailey retweeted
Kaiyue Wen @wen_kaiyue
(1/n) Introducing Hyperball — an optimizer wrapper that keeps weight & update norm constant and lets you control the effective (angular) step size directly. Result: sustained speedups across scales + strong hyperparameter transfer.
27 replies · 118 reposts · 683 likes · 194.1K views
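I haven't seen the Hyperball code, but a minimal numpy sketch of the stated idea (keep the weight norm constant and take steps of a fixed, directly controlled angular size) might look like the following. The function name, the `theta` parameter, and the exact update rule are my reading of the tweet, not the released implementation:

```python
import numpy as np

def angular_step(w: np.ndarray, grad: np.ndarray, theta: float = 0.01) -> np.ndarray:
    """One update that preserves ||w|| and moves w by a controlled angle.
    Illustrative reconstruction of 'constant weight & update norm with a
    direct angular step size', not Hyperball's actual code."""
    radius = np.linalg.norm(w)
    direction = grad / (np.linalg.norm(grad) + 1e-12)  # constant-norm update
    w_new = w - theta * radius * direction             # step length ∝ radius · θ
    return w_new * (radius / np.linalg.norm(w_new))    # project back to the sphere
```

Because the effective step is an angle rather than a raw magnitude, the same `theta` behaves similarly across layers and model scales, which is one plausible reading of the claimed hyperparameter transfer.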