Luke Bailey

129 posts

@LukeBailey181

CS PhD student @Stanford. Former CS and Math undergraduate @Harvard.

Joined July 2023
325 Following · 442 Followers
Pinned Tweet
Luke Bailey @LukeBailey181
Can interpretability help defend LLMs? We find we can reshape activations while preserving a model’s behavior. This lets us attack latent-space defenses, from SAEs and probes to Circuit Breakers. We can attack so precisely that we make a harmfulness probe output this QR code. 🧵
11 replies · 83 reposts · 372 likes · 58.5K views
Luke Bailey retweeted
Tengyu Ma @tengyuma
Maybe the hardest job in 10 years will be to train humans to be super-AI rather than to train AI to be super-human. Academia💪!! But... one needs to believe this task is feasible, and that this job itself is not replaceable by AI... 😂
7 replies · 6 reposts · 183 likes · 23.6K views
Luke Bailey retweeted
Alex Serrano @sertealex
What if a model could strategically misbehave rarely enough that you'd never catch it during testing? LLMs struggle with calibration in many contexts. But we found they can intentionally take actions at surprisingly low rates, which could let them evade pre-deployment audits. 🧵
11 replies · 12 reposts · 95 likes · 9K views
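The audit-evasion worry above has a simple quantitative core. As my own illustration (not from the paper): if a model misbehaves independently with probability p per action, an audit that samples n actions catches it with probability 1 - (1 - p)^n, so a sufficiently low p slips past any realistically sized pre-deployment audit while still occurring at deployment scale.

```python
def detection_probability(p: float, n: int) -> float:
    """Probability that an n-sample audit observes at least one bad action,
    assuming independent misbehavior with per-action probability p."""
    return 1.0 - (1.0 - p) ** n

# A model misbehaving once per 100k actions is very unlikely to be caught
# by a 1,000-sample audit, yet near-certain to misbehave in deployment.
audit_catch = detection_probability(p=1e-5, n=1_000)        # ≈ 0.01
deploy_occur = detection_probability(p=1e-5, n=10_000_000)  # ≈ 1.0
```

The asymmetry between the two numbers is exactly why intentional low-rate actions would matter for pre-deployment testing.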
Luke Bailey retweeted
Suhas Kotha @kothasuhas
To improve fine-tuning data efficiency, replay generic pre-training data. Not only does this reduce forgetting, it actually improves performance on the fine-tuning domain! Especially when fine-tuning data is scarce in pre-training. (w/ @percyliang)
15 replies · 64 reposts · 498 likes · 70.6K views
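The replay recipe above amounts to mixing a fraction of generic pre-training examples into each fine-tuning batch. A minimal sketch of such a data mixer, assuming a fixed 25% replay fraction (the function name and ratio are illustrative, not the paper's recipe):

```python
import random

def mixed_batches(finetune, pretrain, batch_size=8, replay_frac=0.25, seed=0):
    """Yield fine-tuning batches with a fixed fraction of generic
    pre-training examples replayed in to reduce forgetting."""
    rng = random.Random(seed)
    n_replay = int(batch_size * replay_frac)  # replayed pre-training examples
    n_ft = batch_size - n_replay              # fine-tuning examples
    while True:
        batch = rng.sample(finetune, n_ft) + rng.sample(pretrain, n_replay)
        rng.shuffle(batch)  # avoid a fixed ordering within the batch
        yield batch
```

In a real training loop the two pools would be token streams rather than lists, but the per-batch mixing ratio is the knob the tweet is describing.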
Luke Bailey retweeted
Tanishq Kumar @tanishqkumar07
Fast inference excites me because one AI workload I care a lot about is extremely long-horizon reasoning. Imagine a data center of B200s dedicated entirely to running a model thinking for billions of tokens to prove P vs NP. There, halving latency means thinking twice as hard!
1 reply · 8 reposts · 174 likes · 20.5K views
Luke Bailey retweeted
Cas (Stephen Casper) @StephenLCasper
SOTA agent interest, papers, and products are all accelerating.
1 reply · 1 repost · 1 like · 191 views
Luke Bailey @LukeBailey181
Excited that we are releasing this! The report is much more thorough than our 2024 version, mainly because there are far more AI agents now, and they are being used all over the place.
Cas (Stephen Casper)@StephenLCasper

🚨The 2025 AI Agent Index is out! 🚨 Amidst recent buzz over 🦀 and @NIST’s new agent initiative, we find: - Selective reporting – esp. on safety - Almost all agents backend just 3 model families - Many agents don’t ID themselves as bots online - Big US/China gaps - And more…

0 replies · 0 reposts · 2 likes · 334 views
Luke Bailey retweeted
Tengyu Ma @tengyuma
I’ve worked with so many talented researchers who have magical intuitions for tuning hyperparameters. Hopefully, one day, models trained on millions of runs from base models that understand DL theory can do the same :). A small sign of life below.
Huaqing Zhang@zhqwqwq

🚀Introducing our new work: Configuration-to-Performance Scaling Law with Neural Ansatz. A language model trained on large-scale pretraining logs can accurately predict how training configurations influence pretraining performance and generalize to runs with 10x more compute.

2 replies · 10 reposts · 136 likes · 18.8K views
Luke Bailey retweeted
Huaqing Zhang @zhqwqwq
🚀Introducing our new work: Configuration-to-Performance Scaling Law with Neural Ansatz. A language model trained on large-scale pretraining logs can accurately predict how training configurations influence pretraining performance and generalize to runs with 10x more compute.
15 replies · 31 reposts · 162 likes · 45.4K views
Luke Bailey retweeted
Jakub Pachocki @merettm
Very excited about the "First Proof" challenge. I believe novel frontier research is perhaps the most important way to evaluate capabilities of the next generation of AI models.

We have run our internal model with limited human supervision on the ten proposed problems. The problems require expertise in their respective domains and are not easy to verify; based on feedback from experts, we believe at least six solutions (2, 4, 5, 6, 9, 10) have a high chance of being correct, and some further ones look promising.

We will only publish the solution attempts after midnight (PT), per the authors' guidance - the sha256 hash of the PDF is d74f090af16fc8a19debf4c1fec11c0975be7d612bd5ae43c24ca939cd272b1a .

This was a side-sprint executed in a week mostly by querying one of the models we're currently training; as such, the methodology we employed leaves a lot to be desired. We didn't provide proof ideas or mathematical suggestions to the model during this evaluation; for some solutions, we asked the model to expand upon some proofs, per expert feedback. We also manually facilitated a back-and-forth between this model and ChatGPT for verification, formatting and style. For some problems, we present the best of a few attempts according to human judgement.

We are looking forward to more controlled evaluations in the next round! 1stproof.org #1stProof
242 replies · 357 reposts · 2.8K likes · 2.5M views
Luke Bailey retweeted
Perry Dong @perryadong
Reinforcement learning doesn't scale like supervised learning—yet.

We introduce Transformer Q-Learning (TQL): a method that unlocks scaling of transformer-based value functions in RL.

We show that value-based RL can also achieve performance gains through scale. (1/7)
10 replies · 38 reposts · 264 likes · 52.9K views
Luke Bailey retweeted
Tengyu Ma @tengyuma
Long CoTs can be really slow—ChatGPT pro can take >= 30 minutes! We design and train Divide-and-Conquer-CoT with RL, where reasoning models identify distinct parallelizable sub-tasks. Much better latency vs accuracy tradeoff on math benchmarks.  arxiv.org/abs/2601.23027
6 replies · 16 reposts · 77 likes · 7.5K views
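The paper's contribution is the RL training that teaches the model to emit genuinely parallelizable sub-tasks; the latency win at inference time then comes from fanning those sub-tasks out concurrently. A hypothetical serving-side sketch, with `solve` as a stand-in for a model call (not the paper's system):

```python
from concurrent.futures import ThreadPoolExecutor

def solve(subtask: str) -> str:
    # Stand-in for an inference call on one sub-task; a real system
    # would send this to the model server.
    return f"answer({subtask})"

def divide_and_conquer(subtasks: list[str]) -> list[str]:
    # Wall-clock latency is roughly the max over sub-tasks instead of
    # their sum, which is where the latency/accuracy tradeoff improves.
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        return list(pool.map(solve, subtasks))
```

The hard part the RL handles is upstream of this snippet: deciding which parts of a long chain of thought are actually independent.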
Luke Bailey retweeted
Axiom @axiommathai
1/ AxiomProver has solved Fel's open conjecture on syzygies of numerical semigroups, autonomously generating a formal proof in Lean with zero human guidance. This is the first time an AI system has settled an unsolved research problem in theory-building math with a proof that is mechanically verified.
87 replies · 449 reposts · 2.4K likes · 1M views
Luke Bailey retweeted
Yoshua Bengio @Yoshua_Bengio
Today we’re releasing the International AI Safety Report 2026: the most comprehensive evidence-based assessment of AI capabilities, emerging risks, and safety measures to date. 🧵 (1/17)
65 replies · 373 reposts · 1.1K likes · 417.2K views
Luke Bailey retweeted
Zitong Yang @ZitongYang0
We believe AI research is a special research area: one where AI itself can deliver significant progress. Ronald Fisher laid out the foundation of the scientific method: generate a hypothesis (research ideas) and see if you can falsify it (execution). Execution and experiments therefore form the basis of scientific progress.

For mathematics, execution is somewhat special: the chain of thought of AI models implicitly carries it out, which explains much of the progress in math we are seeing. For AI research, execution is purely in code, which AI is extremely good at. Idea generation is in natural language, which AI can also do, although it is not well calibrated right now. So the natural design is to hook up an AI idea generator and an AI experiment executor end-to-end. This is the future paradigm we propose.

We studied two realistic environments: the nanoGPT speedrun initiated by @kellerjordan0 @karpathy, and the GRPO math-reasoning homework built for Stanford CS336 @stanfordnlp. These two "research environments" cover salient and realistic research topics: LLM pretraining and LLM posttraining. For GRPO post-training, the algorithm discovered by AI (see details in the paper!) outperforms the highest-scoring student solution (github.com/stanford-cs336…). For nanoGPT, human experts are still too good to compete with, but AI does reduce the time-to-loss=3.28 from 35 min to 19 min (for reference, human experts are at 2 min).

The advantage of AI over humans is very simple: AI is tireless. Over the past few months, our wandb logs tell us that our AI researcher tried more than 50K research ideas, iteratively learning from the logs of previous experiments. Humans are, of course, more insightful, but simply can't try so many research ideas. Unfortunately, we didn't close the loop of applying the posttraining algorithm discovered by AI to obtain a stronger model and deploying that model to improve itself even further.

This should just be the beginning of execution-grounded automated AI research. Self-improvement (AI inventing better algorithms to train itself) can be an objective for AI in its own right: an intrinsically motivated goal. This is my last project from Stanford. Deeply grateful to @ChengleiSi for pushing it together and to @saurabhsgupta for funding us.
CLS@ChengleiSi

Can LLMs automate frontier LLM research, like pre-training and post-training? In our new paper, LLMs found post-training methods that beat GRPO (69.4% vs 48.0%), and pre-training recipes faster than nanoGPT (19.7 minutes vs 35.9 minutes). 1/

2 replies · 10 reposts · 92 likes · 12.5K views
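The end-to-end loop described above (an AI proposes ideas, an executor runs them, and the accumulated logs condition the next round) can be sketched in a few lines. Here `propose` and `run` are hypothetical stand-ins for the model call and the experiment harness, not the paper's actual system:

```python
def automated_research(propose, run, n_iters=3):
    """Minimal propose-execute loop: `propose` generates an idea in
    natural language conditioned on past experiment logs, `run` executes
    it in code and returns a score. Returns the best (idea, score) pair."""
    logs, best = [], None
    for _ in range(n_iters):
        idea = propose(logs)          # idea generation, conditioned on logs
        score = run(idea)             # execution: run the experiment
        logs.append((idea, score))    # feed results into the next round
        if best is None or score > best[1]:
            best = (idea, score)
    return best
```

The 50K-ideas scale in the tweet corresponds to cranking `n_iters` (and parallel copies of this loop) far beyond what any human could sustain.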
Luke Bailey retweeted
CLS @ChengleiSi
Can LLMs automate frontier LLM research, like pre-training and post-training? In our new paper, LLMs found post-training methods that beat GRPO (69.4% vs 48.0%), and pre-training recipes faster than nanoGPT (19.7 minutes vs 35.9 minutes). 1/
11 replies · 141 reposts · 574 likes · 105.4K views
Luke Bailey retweeted
Kaiyue Wen @wen_kaiyue
(1/n) Introducing Hyperball — an optimizer wrapper that keeps weight & update norm constant and lets you control the effective (angular) step size directly. Result: sustained speedups across scales + strong hyperparameter transfer.
27 replies · 118 reposts · 683 likes · 194.1K views
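I haven't seen the Hyperball code, but a minimal numpy sketch of the stated idea (keep the weight norm constant and take steps of a fixed, directly controlled angular size) might look like the following. The function name, the `theta` parameter, and the exact update rule are my reading of the tweet, not the released implementation:

```python
import numpy as np

def angular_step(w: np.ndarray, grad: np.ndarray, theta: float = 0.01) -> np.ndarray:
    """One update that preserves ||w|| and moves w by a controlled angle.
    Illustrative reconstruction of 'constant weight & update norm with a
    direct angular step size', not Hyperball's actual code."""
    radius = np.linalg.norm(w)
    direction = grad / (np.linalg.norm(grad) + 1e-12)  # constant-norm update
    w_new = w - theta * radius * direction             # step length ∝ radius · θ
    return w_new * (radius / np.linalg.norm(w_new))    # project back to the sphere
```

Because the effective step is an angle rather than a raw magnitude, the same `theta` behaves similarly across layers and model scales, which is one plausible reading of the claimed hyperparameter transfer.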