Lechao Xiao
@Locchiu
190 posts

Research Scientist @GoogleDeepMind / Google Brain. Tackling scaling, along the path to AGI.

New York, NY · Joined September 2009
619 Following · 1.3K Followers
Junyang Lin @JustinLin610
me stepping down. bye my beloved qwen.
1.7K replies · 738 reposts · 13.6K likes · 6.5M views
Hieu Pham @hyhieu226
I have made the difficult decision to leave @OpenAI. Working here and at @xai before was a once-in-a-lifetime experience. I have met the best people. Not the best people in AI. Not the best people in tech. Simply the best people. At these companies, I have helped create extremely intelligent entities that will meaningfully improve our lives. The work makes me proud. But the intensive work came with a price. I cannot believe I would say this one day, but I am burnt out. All the mental health deterioration that I used to scoff at is real, miserable, scary, and dangerous. I am going to take a break from frontier AI labs, and will take my family to my home country, Vietnam. There, I will try something new, and also search for a cure for my conditions. I hope I will heal. Until then.
1.1K replies · 418 reposts · 14K likes · 1.2M views
JC Haswell @JCHaswell
@TheGregYang @Locchiu I was in excellent shape my whole life leading up to Lyme; I don't have pre-Lyme numbers, but the last couple of years my HRV has averaged 15-20 and my HR 70+.
2 replies · 0 reposts · 0 likes · 118 views
Greg Yang @TheGregYang
My HRV and RHR have steadily worsened since October. Kinda strange, since my life was much more stressful before October.
[media attached]
42 replies · 1 repost · 256 likes · 34.5K views
Lechao Xiao @Locchiu
@TheGregYang In my case, it correlates quite well with exercise and sleep; exercise helps improve both. Magnesium also boosts my HRV by 3-5 pts. No alcohol also helps 😂
1 reply · 0 reposts · 1 like · 518 views
Greg Yang @TheGregYang
@Locchiu I didn't do much at all till September, and then stopped after an injury in October. No change in magnesium that I can recall.
2 replies · 0 reposts · 3 likes · 2.2K views
Lechao Xiao @Locchiu
@Jianlin_S Thanks, Jianlin! Hope you have a wonderful Chinese New Year!
0 replies · 0 reposts · 0 likes · 281 views
jianlin.su @Jianlin_S
Beyond MuP: 2. Linear Layers and Steepest Descent (kexue.fm/archives/11605). The last blog post before the 2026 Spring Festival. Happy Chinese New Year!
5 replies · 30 reposts · 252 likes · 37.7K views
Lechao Xiao reposted
trieu @thtrieu_
Mathematicians 🤝 AI researchers arxiv.org/abs/2601.22401. Our take on AI solving Erdős problems:
* Many "Open" problems are actually just obscure: in many cases the AI didn't find something new, only rediscovered solutions buried in the literature. We present our systematic approach to reporting AI results on Erdős problems.
* The real bottleneck is still human labor; e.g., we spent lots of time filtering out technically correct but meaningless solutions (the AI missed Erdős's original intent).
* Acceleration in solving low-hanging fruit is real, but we also need to highlight the many more misses that require human auditing.
Clear research directions ahead though, and we feel optimistic about drastically increasing the signal-to-noise ratio. More to come!
Thang Luong @lmthang

Here's the paper link to our scaled effort for tackling Erdős problems. We started with 700 problems marked ‘Open’ in the database. Our agent #Aletheia identified potential solutions to 200 problems. Initial human grading revealed 63 correct answers, followed by deep expert evaluation and discussion to eventually arrive at meaningful proofs for 13 Erdős problems. arxiv.org/abs/2601.22401

9 replies · 35 reposts · 207 likes · 29.5K views
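For scale, the funnel in the quoted post works out as below; this is just a quick computation on the numbers given, nothing more:

```python
# Yield at each stage of the Erdős-problem funnel reported above.
stages = [("marked 'Open'", 700), ("flagged by the agent", 200),
          ("passed initial human grading", 63), ("meaningful proofs", 13)]
for (name, n), (_, prev) in zip(stages[1:], stages):
    print(f"{name}: {n} ({n / prev:.0%} of the previous stage)")
print(f"overall yield: {stages[-1][1] / stages[0][1]:.1%}")  # ~1.9%
```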
Lechao Xiao @Locchiu
@peterjliu It feels like the beginning of vibe proof, with humans in the loop as verifiers.
0 replies · 0 reposts · 1 like · 71 views
Lechao Xiao @Locchiu
@peterjliu "My preference would still be for the final writeup for this result to be primarily human-generated in the most essential portions of the paper, though I can see a case for delegating routine proofs to some combination of AI-generated text and Lean code. But to me, the ..."
1 reply · 0 reposts · 0 likes · 87 views
Peter J. Liu @peterjliu
Wow, Terence Tao is basically saying AI can produce acceptable math research papers: "This resulted in a new writeup of the proof drive.google.com/file/d/1MRQfcH… that had less of the feel of a generic AI-produced document, and which I judge to be at a level of writing within ballpark of an acceptable standard for a research paper, although there is still room for further improvement." mathstodon.xyz/@tao/115855840…
1 reply · 0 reposts · 4 likes · 865 views
Dan Roy @roydanroy
In mid-January, I’ll join Google DeepMind’s Science unit as a Visiting Research Scientist, on leave from the University of Toronto. I'm excited to be joining Google DeepMind's efforts to accelerate mathematical research with AI.
30 replies · 6 reposts · 302 likes · 13.6K views
Dan Roy @roydanroy
Big announcement time... Today is my last day as Research Director at the Vector Institute. It has been my incredible privilege over the past 2.5 years to serve the Vector community and help build an institution that supports world-class ML research and real-world impact.
36 replies · 10 reposts · 606 likes · 55K views
rohan anil @_arohan_
I did a bad job naming Shampoo and its variants since 2018. We obviously went deep into various aspects of preconditioning, but my colleagues were insistent on the shampoo brand. Now it's clear to me that we should have named it:
shampoo-pro-ultra-max-high
shampoo-lite-medium-high
Merry Christmas!
5 replies · 3 reposts · 202 likes · 20.6K views
Lechao Xiao reposted
Amr Khalifa @AmrMAlameen
I am hiring a student researcher to work with our team in Montreal on LLM architecture and pre-training in spring/summer 2026. If you're excited to push the frontier of research forward, join us and help keep the TPUs warm. Fill out this form: forms.gle/1AfdyCbzjdKi2y…
11 replies · 42 reposts · 471 likes · 35.9K views
elie @eliebakouch
> best open-weight LLM by a US company
This is cool, but I'm not sure about emphasizing the "US" part, since the base model is DeepSeek V3.
Drishan Arora @drishanarora

Today, we are releasing the best open-weight LLM by a US company: Cogito v2.1 671B. On most industry benchmarks and our internal evals, the model performs competitively with frontier closed and open models, while being ahead of any US open model (such as the best versions of OpenAI's GPT-OSS, Nvidia's Nemotron and Meta's Llama). We also built an interface where you can try the model (it's free and we don't store any chats): chat.deepcogito.com. Additionally, you can download the model on @huggingface, or try it out on @openrouter, @togethercompute, @FireworksAI_HQ, @ollama cloud, @runpod, @baseten, or run it locally using @ollama or @UnslothAI. This model uses significantly fewer tokens than any similarly capable model, because it has better reasoning capabilities. You will also notice improvements across instruction following, coding, longer queries, multi-turn and creativity.
📌 Model Weights: huggingface.co/collections/de…
📌 Openrouter: openrouter.ai/deepcogito/cog…
📌 HF Blog: huggingface.co/blog/deepcogit…
Some notes on our approach + design choices below 👇

11 replies · 6 reposts · 255 likes · 25.1K views
Lechao Xiao @Locchiu
First scaling law: performance follows a power law of compute, with its exponent governed by science and engineering. Second scaling law: the total improvement of this law follows a power law of resources, with its exponent governed by vision and conviction.
0 replies · 0 reposts · 5 likes · 757 views
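One way to write the two laws down as formulas; this is only a sketch, and the symbols a, alpha, b, beta are illustrative, not from the post:

```latex
% First law: loss L as a power law of training compute C, with the
% exponent alpha set by science and engineering.
% Second law (one reading): the total improvement won from invested
% resources R, e.g. an effective-compute multiplier m, is itself a
% power law, with beta set by vision and conviction.
L(C) \approx a\,C^{-\alpha}, \qquad m(R) \approx b\,R^{\beta}
```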
Lechao Xiao @Locchiu
@vinaysrao Congrats Vinay, this is super cool! I tried to find the tokenizer and vocab size, but couldn't find them in the paper. Do you mind sharing them (it would also be good to add this info to the paper)?
1 reply · 0 reposts · 1 like · 447 views
Vinay S Rao @vinaysrao
While at Meta, I worked on this optimizer wrapper (outer-step lookahead momentum) we're calling Snoo (arxiv.org/abs/2510.15830). You can use it with AdamW or Muon and see really strong scaling. Here's a plot where we ran it against (tuned) AdamW up to 1e23 training-FLOP scales. The "x"s in the plot are compute factors, i.e., the baseline needs "x" more FLOPs to reach the same loss (instead of simply measuring in steps). We further established a medium-track WR on modded-nanogpt (github.com/KellerJordan/m…). With amazing co-authors (Dominik, Vishal, Michael).
[media attached]
6 replies · 23 reposts · 234 likes · 19.6K views
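The post doesn't spell out the update rule, so the sketch below is only a guess at the general shape of an outer-step lookahead-momentum wrapper (in the spirit of Lookahead, Zhang et al. 2019); the class name `OuterStepMomentum` and the hyperparameters `k`, `alpha`, `mu` are illustrative, not from the Snoo paper:

```python
import torch

class OuterStepMomentum:
    """Hypothetical wrapper: momentum applied to Lookahead-style outer steps."""
    def __init__(self, inner_opt, params, k=10, alpha=0.5, mu=0.9):
        self.inner = inner_opt                                  # e.g. AdamW or Muon
        self.params = list(params)
        self.slow = [p.detach().clone() for p in self.params]   # slow weights
        self.buf = [torch.zeros_like(p) for p in self.params]   # outer momentum
        self.k, self.alpha, self.mu, self.t = k, alpha, mu, 0

    def step(self):
        self.inner.step()                     # ordinary inner update
        self.t += 1
        if self.t % self.k == 0:              # outer step every k inner steps
            with torch.no_grad():
                for p, s, b in zip(self.params, self.slow, self.buf):
                    b.mul_(self.mu).add_(p - s)   # momentum on the outer delta
                    s.add_(b, alpha=self.alpha)   # move the slow weights
                    p.copy_(s)                    # reset the fast weights
```

Usage would be `opt = OuterStepMomentum(torch.optim.AdamW(model.parameters()), model.parameters())`, then `opt.step()` after each backward pass.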
Lechao Xiao @Locchiu
@hyhieu226 lol, analysts are the last defenders of human intelligence
0 replies · 0 reposts · 0 likes · 343 views
Hieu Pham @hyhieu226
A friend of mine won an IMO gold, went on to obtain a PhD in algebraic topology, and now works at a frontier AI lab. This guy, however, doesn't know how to do integration by parts. He knows the principle, but treats those tricks as beneath him. AI models today give the same vibe.
64 replies · 123 reposts · 3.4K likes · 285.5K views
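For reference, the trick in question is one line of standard calculus (not from the post), with a worked instance:

```latex
\int u\,dv = uv - \int v\,du,
\qquad\text{e.g.}\quad
\int x e^{x}\,dx = x e^{x} - \int e^{x}\,dx = (x - 1)e^{x} + C
```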
Lechao Xiao @Locchiu
@Andrea__M Do you mind summarizing your insight about "1980s nonparametric statistics" vs. "scaling laws" theory papers? We can then begin with this and brainstorm new research directions / basic questions. This could turn into a fruitful discussion on the theory of scaling.
0 replies · 0 reposts · 3 likes · 534 views
Andrea Montanari @Andrea__M
Honest question: what "scaling laws" theory papers are not a variation on 1980s nonparametric statistics?
6 replies · 8 reposts · 134 likes · 16K views
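For context on the comparison being alluded to: classical nonparametric regression already delivers power-law "scaling in data". A textbook minimax result (Stone, 1982) for estimating an s-smooth function of d variables from n samples is

```latex
% Minimax excess risk over an s-smooth class in d dimensions:
\inf_{\hat f}\;\sup_{f \in \mathcal{F}_{s,d}}
\mathbb{E}\,\lVert \hat f - f \rVert_2^2 \;\asymp\; n^{-\frac{2s}{2s+d}}
```

i.e. exactly the power-law shape that modern scaling-law papers fit empirically.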
William Fedus @LiamFedus
Today, @ekindogus and I are excited to introduce @periodiclabs. Our goal is to create an AI scientist.

Science works by conjecturing how the world might be, running experiments, and learning from the results. Intelligence is necessary, but not sufficient. New knowledge is created when ideas are found to be consistent with reality. And so, at Periodic, we are building AI scientists and the autonomous laboratories for them to operate in.

Until now, scientific AI advances have come from models trained on the internet. But despite its vastness, it's still finite (estimates are ~10T text tokens, where one English word may be 1-2 tokens). And in recent years the best frontier AI models have fully exhausted it. Researchers seek better use of this data, but as any scientist knows: though re-reading a textbook may give new insights, they eventually need to try their idea to see if it holds.

Autonomous labs are central to our strategy. They provide huge amounts of high-quality data (each experiment can produce GBs of data!) that exists nowhere else. They generate valuable negative results which are seldom published. But most importantly, they give our AI scientists the tools to act.

We're starting in the physical sciences. Technological progress is limited by our ability to design the physical world. We're starting here because experiments have high signal-to-noise and are (relatively) fast, physical simulations effectively model many systems, but more broadly, physics is a verifiable environment. AI has progressed fastest in domains with data and verifiable results, for example in math and code. Here, nature is the RL environment.

One of our goals is to discover superconductors that work at higher temperatures than today's materials. Significant advances could help us create next-generation transportation and build power grids with minimal losses. But this is just one example: if we can automate materials design, we have the potential to accelerate Moore's Law, space travel, and nuclear fusion.

We're also working to deploy our solutions with industry. As an example, we're helping a semiconductor manufacturer that is facing issues with heat dissipation on their chips. We're training custom agents for their engineers and researchers to make sense of their experimental data in order to iterate faster.

Our founding team co-created ChatGPT, DeepMind's GNoME, OpenAI's Operator (now Agent), the neural attention mechanism, and MatterGen; have scaled autonomous physics labs; and have contributed to some of the most important materials discoveries of the last decade. We've come together to scale up and reimagine how science is done.

We're fortunate to be backed by investors who share our vision, including @a16z who led our $300M round, as well as @Felicis, DST Global, NVentures (NVIDIA's venture capital arm), @Accel, and individuals including @JeffBezos, @eladgil, @ericschmidt, and @JeffDean. Their support will help us grow our team, scale our labs, and develop the first generation of AI scientists.
[media attached]
429 replies · 443 reposts · 4.2K likes · 3.5M views