Lin Yang

201 posts

@lyang36

Associate Professor of ECE&CS@UCLA. ML, RL, big data, algorithms, astronomy.

Los Angeles, CA · Joined October 2011
1.1K Following · 3.3K Followers
Pinned Tweet
Lin Yang @lyang36
Our IMO gold medal-winning AI pipeline is now model-agnostic. 🥇 What worked for Gemini 2.5 Pro now gets the same 5/6 score with GPT-5 & Grok4. This confirms the power of our verification-and-refinement pipeline to improve base model capabilities. The new code & results are live on GitHub [github.com/lyang36/IMO25]! Paper update coming soon. Huge thanks to @xai for the Grok4 API credits! #AI #LLM #IMO #MathOlympiad #OpenSource
19 replies · 84 reposts · 1.1K likes · 129.6K views
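The pinned tweet describes the pipeline only at a high level; the full implementation is in the linked repo. Purely as an illustration of the general idea, here is a minimal sketch of a verification-and-refinement loop of this kind, where `ask` is a hypothetical helper standing in for whichever chat-completion API the base model exposes; it is not the actual code from github.com/lyang36/IMO25:

```python
# Minimal sketch of a model-agnostic verify-and-refine loop.
# Illustration only; the real pipeline is at github.com/lyang36/IMO25.

def ask(model: str, prompt: str) -> str:
    """Hypothetical helper: wire this to your provider's chat API
    (Gemini, GPT-5, Grok, ...)."""
    raise NotImplementedError

def solve_with_verification(problem: str, model: str, max_rounds: int = 10):
    solution = ask(model, f"Solve with a rigorous, complete proof:\n{problem}")
    for _ in range(max_rounds):
        # A fresh instance acts as verifier and must list concrete flaws.
        report = ask(model,
                     "Check this proof step by step. Reply PASS if fully "
                     f"rigorous, else list every flaw.\n\nProblem:\n{problem}"
                     f"\n\nProof:\n{solution}")
        if report.strip().startswith("PASS"):
            return solution  # verifier found no flaws
        # Refinement: the solver revises against the verifier's report.
        solution = ask(model,
                       f"Revise the proof to fix these flaws:\n{report}\n\n"
                       f"Problem:\n{problem}\n\nCurrent proof:\n{solution}")
    return None  # no verified solution within the round budget
```

Because a loop like this only needs a text-in/text-out `ask`, swapping Gemini 2.5 Pro for GPT-5 or Grok 4 is a one-line change, which is what "model-agnostic" buys.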
Lin Yang reposted
Watcher.Guru @WatcherGuru
JUST IN: Bitcoin crashes under $69,000 after President Trump threatens to "obliterate" Iran's power plants if Strait of Hormuz is not opened within 48 hours.
802 replies · 1.2K reposts · 10.9K likes · 1.3M views
Lin Yang @lyang36
Our new work at ICLR 2026: a one-shot, hardware-aligned pruning approach that achieves SOTA performance!
@

🚨 New paper accepted at #ICLR2026. 🚨 We introduce ARMOR — a one-shot, hardware-aligned pruning method that dramatically outperforms existing semi-structured pruning while keeping real inference speedups. Paper Link: openreview.net/forum?id=8NE55… 👇 Thread ↓

0 replies · 0 reposts · 5 likes · 694 views
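For readers outside the pruning literature: "hardware-aligned, semi-structured" usually refers to patterns like 2:4 sparsity, where every aligned group of four weights keeps at most two nonzeros so GPU sparse tensor cores can skip the rest. The sketch below shows only that baseline pattern with a naive magnitude criterion; it is not ARMOR's method, just the constraint that ARMOR-style pruners must satisfy:

```python
import numpy as np

def prune_2_4(weights: np.ndarray) -> np.ndarray:
    """Zero the 2 smallest-magnitude entries in every aligned group of 4.

    2:4 is the semi-structured pattern that sparse tensor cores
    accelerate; plain magnitude selection is a baseline, not ARMOR.
    """
    w = weights.reshape(-1, 4).copy()
    drop = np.argsort(np.abs(w), axis=1)[:, :2]  # 2 smallest |w| per group
    np.put_along_axis(w, drop, 0.0, axis=1)
    return w.reshape(weights.shape)

w = np.random.randn(4, 8).astype(np.float32)
# Every aligned group of 4 now has at most 2 nonzeros.
assert (prune_2_4(w).reshape(-1, 4) != 0).sum(axis=1).max() <= 2
```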
Chi Jin @chijinML
Life update🙂: I’m on sabbatical from Princeton and have started at OpenAI, working on building AGI. Happy to be back in the Bay Area after 6 years! Bay Area friends—DMs open for food & hikes.
37 replies · 10 reposts · 658 likes · 63.9K views
Yue Wu @FrankYueWu1
Wrapping up a chapter at xAI.
To my colleagues: thank you to everyone who has helped me along the way. I truly enjoyed the deep technical discussions and appreciate all the opportunities I was given. Compared to a year ago, I’ve grown beyond anything I expected. I’m sure our paths will cross again.
To potential candidates: xAI has some of the most advanced RL systems for language models today. Scaling them from first principles is extremely challenging and deeply rewarding. If you’re early in your career and get the chance, don’t hesitate. xAI is a place where you can take on huge scope, and hard work is genuinely recognized.
It has been a blast, my friends.
37 replies · 11 reposts · 519 likes · 60.1K views
Lin Yang reposted
Gautam Kamath @thegautamkamath
IJCAI 2026 will charge $100 USD per submission. Funds will be used to compensate reviewers.
15 replies · 39 reposts · 351 likes · 88.3K views
Lin Yang reposted
Thang Luong @lmthang
A bit of history of IMO-Bench and our IMO efforts:
a. We started building IMO-Bench around early 2024; it was the precursor of ProofBench (basic).
b. IMO-Bench was first mentioned in the Gemini 1.5 paper around May 2024. At that time, Gemini Math-specialized 1.5 Pro scored only 25% (whereas its performance on Hendrycks's MATH was 81%, a big breakthrough!).
c. Fast forward to April 2025: Gemini 2.5 Pro scored 55% on ProofBench (basic) and 35% on ProofBench (advanced). Nobody talked about the MATH benchmark anymore (it has served its purpose well!).
d. Also in April 2025, our paper was accepted to ACL 2025. We asked leadership to share it publicly, first on arXiv, but were asked to wait until IMO 2025, so we postponed ACL. At the time it felt like a major setback, because it was uncertain whether we would surpass ourselves at IMO 2025 (we got Silver at IMO 2024). We kept marching on.
e. At IMO 2025, our generalist model (advanced Gemini Deep Think) scored 89.0% on ProofBench (basic) and 66% on ProofBench (advanced). And the rest was history 🙂
In case people missed it, this project page has all the info about IMO-Bench: imobench.github.io.
Thang Luong @lmthang

IMO-ProofBench is our key focus, designed to evaluate the ability of AI models to construct rigorous and valid mathematical arguments. With 60 proof-based problems, the benchmark is divided into two subsets: a basic set covering pre-IMO to IMO-Medium difficulty levels, and an advanced set featuring novel, highly challenging problems simulating complete IMO examinations, up to IMO-Hard level. Our goal for the basic set is to assess models in their early stages of development; sufficiently strong performance on the basic set would justify progression to the advanced set. Performance on the basic IMO-ProofBench varies significantly: while Gemini Deep Think (IMO Gold) achieves a high score of 89.0%, most models score below 60%, indicating that there is still considerable room for improvement. The advanced IMO-ProofBench proves to be a more significant challenge: all non-Gemini models score below 25%. Our IMO-gold model achieved a state-of-the-art score of 65.7% according to human evaluations. This represents a substantial leap in capability, but its distance from a perfect score indicates that even the strongest models have room for growth in sophisticated mathematical reasoning.

5 replies · 14 reposts · 140 likes · 16.3K views
Lin Yang reposted
Ming Jin @MingJin_AI
Excited for this week's AI Agent Frontier Seminar! We're thrilled to host @lyang36 from UCLA. His topic: "Winning Gold at IMO 2025 with a Model-Agnostic Self-Verification Pipeline." 🥇 He'll discuss how agentic strategies can solve complex problems that direct prompting can't. Join us this Friday, Nov 7, at 9 AM PT / 12 PM ET. All are welcome! Details & Zoom: agentic-ai-frontier-seminar.github.io #AI #LLMs #Reasoning #IMO #AIagents
0 replies · 1 repost · 10 likes · 1.6K views
Lin Yang reposted
Yu-Xiang Wang @yuxiangw_cs
🚀 We just set a new SOTA for LLM inference acceleration with speculative decoding. By corralling a band of specialist drafters, we got 4.99× on Llama-3.1-8B-Instruct, 4.93× on Qwen-32B — beating EAGLE3 by nearly 2x. No gimmicks. Just careful math + solid engineering. 🧵1/
14 replies · 48 reposts · 332 likes · 34.5K views
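For context, speculative decoding lets a cheap drafter propose several tokens that the large target model then verifies, keeping the target's exact output while paying for far fewer target passes. The novelty claimed in the thread is routing among a band of specialist drafters; the sketch below shows only the single-drafter greedy core, with models reduced to next-token functions for brevity:

```python
from typing import Callable

Model = Callable[[list[int]], int]  # greedy next-token: token ids -> next id

def speculative_step(target: Model, drafter: Model,
                     ctx: list[int], k: int = 4) -> list[int]:
    """One greedy speculative-decoding step (single drafter only).

    The drafter proposes k tokens; the target re-derives each position
    (a single batched pass in practice) and we keep the longest agreeing
    prefix plus the target's own token at the first mismatch. The output
    is identical to decoding with the target alone.
    """
    draft: list[int] = []
    for _ in range(k):
        draft.append(drafter(ctx + draft))
    out: list[int] = []
    for i, tok in enumerate(draft):
        t = target(ctx + draft[:i])
        if t != tok:            # first disagreement: take target's token
            out.append(t)
            return out
        out.append(tok)         # agreement: a token accepted almost for free
    out.append(target(ctx + draft))  # all k accepted: one bonus target token
    return out
```

Speedups of the reported kind come from most draft tokens being accepted; with several specialist drafters, a router would pick whichever drafter is most likely to be accepted on the current context.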
Lin Yang @lyang36
Thanks Steve for the invitation. It’s a great pleasure to speak at Manifold.
@

AIs Win Math Olympiad Gold: Prof. Lin Yang (UCLA) – Manifold #97
Lin Yang is a professor of computer science at UCLA. Recently, he and his collaborator built an AI pipeline using commercial models such as Gemini, ChatGPT, and Grok that performed at the gold medal level on International Mathematics Olympiad problems. Steve and Lin discuss this research, which relies on "verifier-refiner" LLM instances and large token budgets to reliably solve difficult problems. They discuss how these methods can be used to advance AI for scientific research, legal analysis, and complex document processing.
(00:00) - AIs Win Math Olympiad Gold: Prof. Lin Yang (UCLA) – #97
(00:57) - Prof. Lin Yang, UCLA
(04:27) - Journey from Physics to Computer Science: 2 PhDs
(11:15) - Transition to AI from Theoretical CS
(13:16) - AI Pipeline Math Olympiad: Gold Medal!
(28:23) - Probability Amplification
(29:00) - Applications in Industry and Legal Analysis
(29:58) - Challenges in Model Reasoning and Verification
(33:23) - Future of AI in Scientific Research and AGI Speculations

1 reply · 0 reposts · 12 likes · 1.8K views
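The "Probability Amplification" segment refers to a standard argument: if a single pipeline run solves a problem with probability p and the verifier reliably rejects failed runs, then at least one of n independent runs survives with probability 1 - (1 - p)^n. A quick check of the numbers, under the idealized assumptions of a perfect verifier and independent runs (p = 0.3 is an arbitrary illustration, not a measured rate):

```python
# Idealized amplification: P(at least one verified success in n runs).
p = 0.30                        # assumed per-run solve probability
for n in (1, 5, 10, 20):
    print(n, round(1 - (1 - p) ** n, 3))
# -> 1: 0.3, 5: 0.832, 10: 0.972, 20: 0.999
```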
Lin Yang reposted
@·
C.N. Yang has passed, age 103. Yang was awarded the Nobel prize at 35, for parity violation (shared with T.D. Lee). But his greatest contribution was probably Yang-Mills theory, now referred to as gauge theory. When I was a student the former designation was as common as the
56 replies · 516 reposts · 3.1K likes · 311.5K views
Mengdi Wang @MengdiWang10
🚀 Introducing LabOS: The AI-XR Co-Scientist
A system that sees, understands, and works with humans in real-world labs.
👁️ Egocentric vision & extended reality
🧠 LLM reasoning & hypothesis generation
🤖 Real-time guidance & multi-modal human-AI collaboration
From observation → understanding → collaboration.
Preprint: arxiv.org/abs/2510.14861
10 replies · 25 reposts · 161 likes · 33.5K views
Lin Yang reposted
Watcher.Guru @WatcherGuru
JUST IN: $BNB reaches new ATH of $1,100
346 replies · 405 reposts · 4.2K likes · 237.3K views
Chi Jin @chijinML
Excited to share that I’ve been promoted to Associate Professor with tenure at Princeton!🎉 6 years may not be long, but AI research has evolved significantly during this period. Grateful to all my students, collaborators, colleagues for being with me on this remarkable journey!
149 replies · 58 reposts · 2.7K likes · 113.9K views
Amin Karbasi @aminkarbasi
This started as a fun personal project. The sample is small, so nothing definitive, but a few patterns emerged when we put GPT-5 to the test.
• When the path was clear, it did great: nearly correct proofs in 3/5 problems.
• On Problem 2, it surprised us with a new approximation guarantee that overturned our original conjecture and still solved the problem.
• Often adapted known proofs well, but with a “copy-paste” flavor — skipping unchanged steps instead of exploring natural alternatives.
• Hit a wall on Problems 4 and 5, both needing cross-technique reasoning. Integrative thinking is clearly still tough.
• On Problem 5, it even picked the right algorithm, but could not analyze it properly. The guarantee might exist, just harder than we guessed.
• Compared to older generations, GPT-5 feels sharper in math and occasionally original — small sparks worth noting.
• Prompting matters a lot. Asking for full proofs made its work more complete and self-contained. Better prompts could move the needle even more.
• The failures looked polished at first glance but hid deep flaws. A reminder that these models can sound right while being very wrong.
We did not test peers (@xai, @Google, @anthropic). Proof-checking is labor-intensive, so that part is for another day/team.
Sebastien Bubeck @SebastienBubeck

It's becoming increasingly clear that GPT-5 can solve MINOR open math problems, those that would require a day or a few days of a good PhD student. Of course it's not a 100% guarantee; e.g., below GPT-5 solves 3/5 optimization conjectures. In my opinion, the full impact of this has yet to be internalized...

6 replies · 8 reposts · 141 likes · 27.2K views
Lin Yang reposted
Jiantao Jiao @JiantaoJ
🚀 We’re hiring at NVIDIA! Our team is pushing the frontier of LLM/DLM post-training and system optimization. We are looking for exceptional people with large-scale LLM + systems experience to join us (full time only).
🔹 Focus areas include:
• Post-training of large models
• Systems for LLM/DLM training & inference at scale
• Efficiency, scaling, and evaluation frameworks for LLMs
At NVIDIA, you’ll work with world-class researchers and engineers on cutting-edge foundation models at unprecedented scale.
👉 If you’re passionate about LLMs, systems, and building the next generation of AI, we’d love to hear from you.
📩 If you’re interested, please send me your CV! @nvidia #LLM #AI #Systems #PostTraining #DeepLearning
22 replies · 33 reposts · 471 likes · 103.3K views
Lin Yang @lyang36
@AtaeiMe I was calling the model manually with Hugging Face.
1 reply · 0 reposts · 0 likes · 812 views
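"Calling the model manually with Hugging Face" ordinarily means the standard transformers generate loop rather than a hosted responses-style API. A minimal version looks like the sketch below; the model id is a placeholder for illustration, not the model actually used in the pipeline:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Qwen/Qwen2.5-7B-Instruct"  # placeholder id, not the actual model
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "Prove that sqrt(2) is irrational."}]
ids = tok.apply_chat_template(messages, add_generation_prompt=True,
                              return_tensors="pt").to(model.device)
out = model.generate(ids, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
print(tok.decode(out[0, ids.shape[-1]:], skip_special_tokens=True))
```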
Mehdi Ataei @AtaeiMe
@lyang36 Is it based on the Responses API? I have seen the model work with it.
1 reply · 0 reposts · 0 likes · 832 views