Jonathan Lee

42 posts

Jonathan Lee

@jon_lee0

research @GoogleDeepMind. co-developed gemini deep think. co-led model training for IMO 🥇 | prev: RL PhD at @StanfordAILab

Mountain View, CA Katılım Temmuz 2025

140 Takip Edilen920 Takipçiler

Sabitlenmiş Tweet

Jonathan Lee@jon_lee0·21 Tem

I’m excited to share the news of Gemini Deep Think’s gold-medal level performance 🥇 at the International Math Olympiad! It has been an absolute blast building Deep Think this year and then scaling it to the IMO.

Google DeepMind@GoogleDeepMind

An advanced version of Gemini with Deep Think has officially achieved gold medal-level performance at the International Mathematical Olympiad. 🥇 It solved 5️⃣ out of 6️⃣ exceptionally difficult problems, involving algebra, combinatorics, geometry and number theory. Here’s how 🧵

English

105

24.6K

Jonathan Lee@jon_lee0·6 Mar

@yubai01 @OpenAI congrats!

English

141

Yu Bai@yubai01·5 Mar

🧄GPT-5.4 is here. 🚀 If you have felt the step-change improvement in GPT-5.3-Codex, GPT-5.4 brings a similar and bigger improvement into ChatGPT, API, and Codex as a unified model. Super proud of what the @OpenAI team has achieved together!

OpenAI@OpenAI

GPT-5.4 Thinking and GPT-5.4 Pro are rolling out now in ChatGPT. GPT-5.4 is also now available in the API and Codex. GPT-5.4 brings our advances in reasoning, coding, and agentic workflows into one frontier model.

English

197

12.2K

Jonathan Lee@jon_lee0·25 Şub

We ran our internal system Aletheia (Deep Think) on FirstProof’s research problems during the week they were released. Aletheia returned solutions to problems 2, 5, 7, 8, 9, and 10. We think there’s a pretty good chance they are correct, based on expert analysis.

Thang Luong@lmthang

Thrilled to share: #Aletheia, our math research agent, just solved 6/10 notoriously hard FirstProof problems autonomously, the best result in the inaugural challenge! To me, this is even bigger than our historic IMO-gold achievement last year; these problems challenge even top mathematicians. We share our results transparently, see paper and full thoughts in the thread. 👇

English

134

10.7K

Jonathan Lee retweetledi

Thang Luong@lmthang·15 Şub

Yes, we provided 3 things for AI-assisted math: * Human-AI interaction (HAI) card (photo), inspired by model cards * Full transcripts github.com/google-deepmin… * A label for novelty-autonomy, inspired by SAE Levels of autonomy, see #Aletheia paper arxiv.org/abs/2602.10177

Daniel Litt@littmath

Really good question (note that DeepMind shared transcripts in their recent Aletheia paper, and I think this is clearly best practice). Hopefully OAI follows suit.

English

123

15K

Jonathan Lee retweetledi

Thang Luong@lmthang·12 Şub

Congrats to the whole Deep Think team from @GoogleDeepMind for this amazing milestone of #DeepThink V2 launch! Such a great a model that powers so many state-of-the-art results from reasoning (ARC-AGI2) to deep knowledge (Humanity's Last Exam), multimodality (MMMU-Pro), coding (Codeforces), the math research agent #Aletheia, and scientific discovery (that we shared just yesterday)! Blog: blog.google/innovation-and… It has been a privilege witnessing the relentless progress 🔥: * ChatGPT -> Bard announcement (Mar 2023): 100 days * Announcement of IMO-gold achievement -> DeepThink v1 launch (Jul 2025): 10 days * Announcement of Aletheia agent & advancements in scientific research -> Deep Think v2 launch (Feb 2026): 1 day More to come! Stay tuned!

English

308

22.5K

Jonathan Lee retweetledi

Yi Tay@YiTayML·12 Şub

Gemini 3 Deep Think is here! 😎 This model is not only super strong in math and coding (IMO gold and 3455 codeforces ELO), it is also gold standard in physics and chemistry olympiads. 😃 Also sets new records on ARC-AGI-2 and HLE. Proud to be a (core) member of the Deep Think team. 🦾😆. Feeling the AGI!

English

331

15.8K

Jonathan Lee@jon_lee0·12 Şub

cool new model

English

1.8K

Jonathan Lee@jon_lee0·12 Şub

Our latest versions of Deep Think are helping accelerate math research. Our new paper dives into examples of the agents semi-autonomously (and sometimes autonomously) contributing new knowledge.

Thang Luong@lmthang

Research-level mathematics draws on advanced techniques from vast literature, with papers often spanning dozens of pages. While foundation models possess a large knowledge base from pretraining, their understanding of advanced subjects remains superficial due to data scarcity, and they are also prone to hallucinations. As such, in the first paper, "Towards Autonomous Mathematics Research", we built #Aletheia (ancient Greek word for "Truth"), a math research agent, that can iteratively generate, verify, and revise solutions end-to-end in natural language. Link to the paper: github.com/google-deepmin… (to be on arXiv soon!) There are 3 main sources that power Aletheia ...

English

1.1K

Jonathan Lee@jon_lee0·17 Ara

@jasondeanlee Glad to hear it, really appreciate all your feedback!

English

669

Jason Lee@jasondeanlee·17 Ara

Thank you gdm for the gemini Ultra. The deep think with gemini 3 is surprisingly good and really fast compared gpt pro. Being 5 or 10 x faster means faster iteration which more important than smart but limited to one shot. with 10 prompts, always get the deep think >> 5.2 pro

English

109

16.7K

Jonathan Lee@jon_lee0·5 Ara

this is so cool

Legit@legit_api

Gemini 3 Deep Think - original music it simulated an orchestra, wow

English

1.8K

Jonathan Lee retweetledi

Taelin@VictorTaelin·5 Ara

For these wondering, and as expected, Gemini 3 Deep Think solves the stack underflow bug that cost me a few days. The answer is a more decisive than Opus 4.5, the only other public model to solve it (even Gemini 3 Pro fails). It even points the exact location confidently. It takes forever though... I don't have harder tests for now, most my benchmarks are saturated and I'm super busy with SupGen stuff, so that's all I have to say about this one

English

787

52.1K

Jonathan Lee retweetledi

Google Gemini@GeminiApp·4 Ara

Gemini 3 Deep Think is here. Deep Think is our most advanced reasoning mode that explores multiple hypotheses simultaneously to give you an even more sophisticated output.

English

407

750

7.7M

Jonathan Lee@jon_lee0·18 Kas

Our best reasoning model from IMO and ICPC is getting even better, and it won't stop here :)

Logan Kilpatrick@OfficialLoganK

And say hello to Gemini 3 Deep Think, even more SOTA compared to Gemini 3 Pro 🤯

English

961

Jonathan Lee retweetledi

Quoc Le@quocleix·18 Kas

Gemini 3 Deep Think is next level. Deep Think was the the engine behind our gold medal-level wins at IMO and ICPC, and now powers an even stronger version of Gemini 3. SOTA above SOTA. More to come soon!

English

426

92.3K

Jonathan Lee retweetledi

Thang Luong@lmthang·4 Kas

Continuing our IMO-gold journey, I’m delighted to share our #EMNLP2025 paper “Towards Robust Mathematical Reasoning”, which tells some of the key stories behind the success of our advanced Gemini #DeepThink at this year IMO. Finding the right north-star metrics was highly critical for our IMO effort and we did it with #IMOBench, a suite of advanced reasoning benchmarks for foundation models. More importantly, we encourage the community to go beyond short answers and showed that automatic grading of long-form answers is promising! Read on to see our project page, paper, and datasets in the thread 🙂

Thang Luong@lmthang

Very excited to share that an advanced version of Gemini Deep Think is the first to have achieved gold-medal level in the International Mathematical Olympiad! 🏆, solving five out of six problems perfectly, as verified by the IMO organizers! It’s been a wild run to lead this effort and I am grateful to everyone in the team for such an amazing achievement! Blog post in the thread and more to share soon!

English

107

711

187.5K

Jonathan Lee retweetledi

Epoch AI@EpochAIResearch·9 Eki

We evaluated Gemini 2.5 Deep Think on FrontierMath. There is no API, so we ran it manually. The results: a new record! We also conducted a more holistic evaluation of its math capabilities. 🧵

English

632

148.2K

Jonathan Lee retweetledi

Annie Xie@_anniexie·26 Eyl

Super excited to share Gemini Robotics 1.5!! Our high-level reasoning model Gemini Robotics-ER 1.5 is also publicly available now! The model is particularly strong at spatial and temporal reasoning, and can use thinking to improve its answers 🧠🤖

Google DeepMind@GoogleDeepMind

We’re making robots more capable than ever in the physical world. 🤖 Gemini Robotics 1.5 is a levelled up agentic system that can reason better, plan ahead, use digital tools such as @Google Search, interact with humans and much more. Here’s how it works 🧵

English

1.5K

Jonathan Lee retweetledi

Ted Xiao@xiao_ted·25 Eyl

📢The next milestone for intelligent general-purpose robots has arrived! Announcing Gemini Robotics 1.5, our flagship system which brings breakthroughs from frontier models to the physical world with two new SOTA generalists: the GR 1.5 VLA and GR 1.5 embodied reasoning model 🧵

English

187

26.3K

Jonathan Lee@jon_lee0·18 Eyl

Amazing work by the team! Another gold-medal level performance from a version of Deep Think 🚀

Google DeepMind@GoogleDeepMind

An advanced version of Gemini 2.5 Deep Think has achieved gold-medal level performance at the ICPC 2025 - one of the world’s most prestigious programming contests. 🏅 Building on the model's success in math at the IMO, this marks another historic milestone for advanced AI. 🧵

English

2.4K

Jonathan Lee retweetledi

Heng-Tze Cheng@HengTze·17 Eyl

I’m excited to announce that an advanced version of Gemini Deep Think achieved gold-medal level performance at the 2025 ICPC World Finals, one of the world’s most prestigious programming competitions! 🥇Learn more in our blog post: bit.ly/46rvjLs An inspiring moment for me personally was when our model solved a problem that no university team solved during the contest — a true moment of innovation. With Gemini Deep Think achieving gold-level across ICPC & IMO, I think we’re seeing a profound leap in generalization across coding, math and reasoning capabilities, to generate novel solutions to complex problems. This is a huge milestone for us on an amazing journey. Really grateful and proud of our team, for all the hard work and teamwork that made this breakthrough possible. Looking forward to continuing our research, helping people use Gemini to solve some of the hardest unsolved problems in the world!

English

328

40.7K

Jonathan Lee retweetledi

Dan Hendrycks@hendrycks·4 Eyl

Few people are aware of how good Gemini Deep Think is. It's at the point where "Should I ask an expert to chew on this or Deep Think?" is often answered with Deep Think. GPT-5 Pro is more "intellectual yet idiot" while Deep Think has better taste. I've been repeating this a lot frequently so deciding to tweet it instead.

English

535

57.3K

Keşfet

@yubai01 @OpenAI @GoogleDeepMind @jasondeanlee @elonmusk @BarackObama @taylorswift13 @cristiano