Lin Yang

201 posts

@lyang36

Associate Professor of ECE&CS@UCLA. ML, RL, big data, algorithms, astronomy.

Los Angeles, CA · Joined October 2011
1.1K Following · 3.3K Followers
Pinned Tweet
Lin Yang @lyang36
Our IMO gold medal-winning AI pipeline is now model-agnostic. 🥇 What worked for Gemini 2.5 Pro now gets the same 5/6 score with GPT-5 & Grok4. This confirms the power of our verification-and-refinement pipeline to improve base model capabilities. The new code & results are live on GitHub [github.com/lyang36/IMO25]! Paper update coming soon. Huge thanks to @xai for the Grok4 API credits! #AI #LLM #IMO #MathOlympiad #OpenSource
19 replies · 84 reposts · 1.1K likes · 129.6K views
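The pinned tweet describes the pipeline only at a high level; the full implementation is in the linked repo. Purely as an illustration of the general idea, here is a minimal sketch of a verification-and-refinement loop of this kind, where `ask` is a hypothetical helper standing in for whichever chat-completion API the base model exposes; it is not the actual code from github.com/lyang36/IMO25:

```python
# Minimal sketch of a model-agnostic verify-and-refine loop.
# Illustration only; the real pipeline is at github.com/lyang36/IMO25.

def ask(model: str, prompt: str) -> str:
    """Hypothetical helper: wire this to your provider's chat API
    (Gemini, GPT-5, Grok, ...)."""
    raise NotImplementedError

def solve_with_verification(problem: str, model: str, max_rounds: int = 10):
    solution = ask(model, f"Solve with a rigorous, complete proof:\n{problem}")
    for _ in range(max_rounds):
        # A fresh instance acts as verifier and must list concrete flaws.
        report = ask(model,
                     "Check this proof step by step. Reply PASS if fully "
                     f"rigorous, else list every flaw.\n\nProblem:\n{problem}"
                     f"\n\nProof:\n{solution}")
        if report.strip().startswith("PASS"):
            return solution  # verifier found no flaws
        # Refinement: the solver revises against the verifier's report.
        solution = ask(model,
                       f"Revise the proof to fix these flaws:\n{report}\n\n"
                       f"Problem:\n{problem}\n\nCurrent proof:\n{solution}")
    return None  # no verified solution within the round budget
```

Because a loop like this only needs a text-in/text-out `ask`, swapping Gemini 2.5 Pro for GPT-5 or Grok 4 is a one-line change, which is what "model-agnostic" buys.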
Lin Yang reposted
Watcher.Guru @WatcherGuru
JUST IN: Bitcoin crashes under $69,000 after President Trump threatens to "obliterate" Iran's power plants if Strait of Hormuz is not opened within 48 hours.
802 replies · 1.2K reposts · 10.9K likes · 1.3M views
Lin Yang @lyang36
Our new work at ICLR 2026: a one-shot, hardware-aligned pruning approach that achieves SOTA performance!
@

🚨 New paper accepted at #ICLR2026. 🚨 We introduce ARMOR — a one-shot, hardware-aligned pruning method that dramatically outperforms existing semi-structured pruning while keeping real inference speedups. Paper Link: openreview.net/forum?id=8NE55… 👇 Thread ↓

0 replies · 0 reposts · 5 likes · 694 views
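For readers outside the pruning literature: "hardware-aligned, semi-structured" usually refers to patterns like 2:4 sparsity, where every aligned group of four weights keeps at most two nonzeros so GPU sparse tensor cores can skip the rest. The sketch below shows only that baseline pattern with a naive magnitude criterion; it is not ARMOR's method, just the constraint that ARMOR-style pruners must satisfy:

```python
import numpy as np

def prune_2_4(weights: np.ndarray) -> np.ndarray:
    """Zero the 2 smallest-magnitude entries in every aligned group of 4.

    2:4 is the semi-structured pattern that sparse tensor cores
    accelerate; plain magnitude selection is a baseline, not ARMOR.
    """
    w = weights.reshape(-1, 4).copy()
    drop = np.argsort(np.abs(w), axis=1)[:, :2]  # 2 smallest |w| per group
    np.put_along_axis(w, drop, 0.0, axis=1)
    return w.reshape(weights.shape)

w = np.random.randn(4, 8).astype(np.float32)
# Every aligned group of 4 now has at most 2 nonzeros.
assert (prune_2_4(w).reshape(-1, 4) != 0).sum(axis=1).max() <= 2
```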
Chi Jin @chijinML
Life update🙂: I’m on sabbatical from Princeton and have started at OpenAI, working on building AGI. Happy to be back in the Bay Area after 6 years! Bay Area friends—DMs open for food & hikes.
37 replies · 10 reposts · 658 likes · 63.9K views
Yue Wu @FrankYueWu1
Wrapping up a chapter at xAI.
To my colleagues: thank you to everyone who has helped me along the way. I truly enjoyed the deep technical discussions and appreciate all the opportunities I was given. Compared to a year ago, I’ve grown beyond anything I expected. I’m sure our paths will cross again.
To potential candidates: xAI has some of the most advanced RL systems for language models today. Scaling them from first principles is extremely challenging and deeply rewarding. If you’re early in your career and get the chance, don’t hesitate. xAI is a place where you can take on huge scope, and hard work is genuinely recognized.
It has been a blast, my friends.
37 replies · 11 reposts · 519 likes · 60.1K views
Lin Yang reposted
Gautam Kamath @thegautamkamath
IJCAI 2026 will charge $100 USD per submission. Funds will be used to compensate reviewers.
15 replies · 39 reposts · 351 likes · 88.3K views
Lin Yang reposted
Thang Luong @lmthang
A bit of history of IMO-Bench and our IMO efforts:
a. We started building IMO-Bench around early 2024; it was the precursor of ProofBench (basic).
b. IMO-Bench was first mentioned in the Gemini 1.5 paper around May 2024. At that time, Gemini Math-specialized 1.5 Pro scored only 25% (whereas its performance on Hendrycks's MATH was 81%, a big breakthrough!).
c. Fast forward to April 2025: Gemini 2.5 Pro scored 55% on ProofBench (basic) and 35% on ProofBench (advanced). Nobody talked about the MATH benchmark anymore (it has served its purpose well!).
d. Also in April 2025, our paper was accepted to ACL 2025. We asked leadership to share it publicly, first on arXiv, but were asked to wait until IMO 2025, so we postponed ACL. At the time it felt like a major setback, because it was uncertain whether we would surpass ourselves at IMO 2025 (we got Silver at IMO 2024). We kept marching on.
e. At IMO 2025, our generalist model (advanced Gemini Deep Think) scored 89.0% on ProofBench (basic) and 66% on ProofBench (advanced). And the rest was history 🙂
In case people missed it, this project page has all the info about IMO-Bench: imobench.github.io.
Thang Luong @lmthang

IMO-ProofBench is our key focus, designed to evaluate the ability of AI models to construct rigorous and valid mathematical arguments. With 60 proof-based problems, the benchmark is divided into two subsets: a basic set covering pre-IMO to IMO-Medium difficulty levels, and an advanced set featuring novel, highly challenging problems simulating complete IMO examinations, up to IMO-Hard level. Our goal for the basic set is to assess models in their early stages of development; sufficiently strong performance on the basic set would justify progression to the advanced set. Performance on the basic IMO-ProofBench varies significantly: while Gemini Deep Think (IMO Gold) achieves a high score of 89.0%, most models score below 60%, indicating that there is still considerable room for improvement. The advanced IMO-ProofBench proves to be a more significant challenge: all non-Gemini models score below 25%. Our IMO-gold model achieved a state-of-the-art score of 65.7% according to human evaluations. This represents a substantial leap in capability, but its distance from a perfect score indicates that even the strongest models have room for growth in sophisticated mathematical reasoning.

5 replies · 14 reposts · 140 likes · 16.3K views
Lin Yang reposted
Ming Jin @MingJin_AI
Excited for this week's AI Agent Frontier Seminar! We're thrilled to host @lyang36 from UCLA. His topic: "Winning Gold at IMO 2025 with a Model-Agnostic Self-Verification Pipeline." 🥇 He'll discuss how agentic strategies can solve complex problems that direct prompting can't. Join us this Friday, Nov 7, at 9 AM PT / 12 PM ET. All are welcome! Details & Zoom: agentic-ai-frontier-seminar.github.io #AI #LLMs #Reasoning #IMO #AIagents
0 replies · 1 repost · 10 likes · 1.6K views
Lin Yang reposted
Yu-Xiang Wang @yuxiangw_cs
🚀 We just set a new SOTA for LLM inference acceleration with speculative decoding. By corralling a band of specialist drafters, we got 4.99× on Llama-3.1-8B-Instruct, 4.93× on Qwen-32B — beating EAGLE3 by nearly 2x. No gimmicks. Just careful math + solid engineering. 🧵1/
14 replies · 48 reposts · 332 likes · 34.5K views
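For context, speculative decoding lets a cheap drafter propose several tokens that the large target model then verifies, keeping the target's exact output while paying for far fewer target passes. The novelty claimed in the thread is routing among a band of specialist drafters; the sketch below shows only the single-drafter greedy core, with models reduced to next-token functions for brevity:

```python
from typing import Callable

Model = Callable[[list[int]], int]  # greedy next-token: token ids -> next id

def speculative_step(target: Model, drafter: Model,
                     ctx: list[int], k: int = 4) -> list[int]:
    """One greedy speculative-decoding step (single drafter only).

    The drafter proposes k tokens; the target re-derives each position
    (a single batched pass in practice) and we keep the longest agreeing
    prefix plus the target's own token at the first mismatch. The output
    is identical to decoding with the target alone.
    """
    draft: list[int] = []
    for _ in range(k):
        draft.append(drafter(ctx + draft))
    out: list[int] = []
    for i, tok in enumerate(draft):
        t = target(ctx + draft[:i])
        if t != tok:            # first disagreement: take target's token
            out.append(t)
            return out
        out.append(tok)         # agreement: a token accepted almost for free
    out.append(target(ctx + draft))  # all k accepted: one bonus target token
    return out
```

Speedups of the reported kind come from most draft tokens being accepted; with several specialist drafters, a router would pick whichever drafter is most likely to be accepted on the current context.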
Lin Yang @lyang36
Thanks Steve for the invitation. It’s a great pleasure to speak at Manifold.
@

AIs Win Math Olympiad Gold: Prof. Lin Yang (UCLA) – Manifold #97
Lin Yang is a professor of computer science at UCLA. Recently, he and his collaborator built an AI pipeline using commercial models such as Gemini, ChatGPT, and Grok that performed at the gold medal level on International Mathematics Olympiad problems. Steve and Lin discuss this research, which relies on "verifier-refiner" LLM instances and large token budgets to reliably solve difficult problems. They discuss how these methods can be used to advance AI for scientific research, legal analysis, and complex document processing.
(00:00) - AIs Win Math Olympiad Gold: Prof. Lin Yang (UCLA) – #97
(00:57) - Prof. Lin Yang, UCLA
(04:27) - Journey from Physics to Computer Science: 2 PhDs
(11:15) - Transition to AI from Theoretical CS
(13:16) - AI Pipeline Math Olympiad: Gold Medal!
(28:23) - Probability Amplification
(29:00) - Applications in Industry and Legal Analysis
(29:58) - Challenges in Model Reasoning and Verification
(33:23) - Future of AI in Scientific Research and AGI Speculations

1 reply · 0 reposts · 12 likes · 1.8K views
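The "Probability Amplification" segment refers to a standard argument: if a single pipeline run solves a problem with probability p and the verifier reliably rejects failed runs, then at least one of n independent runs survives with probability 1 - (1 - p)^n. A quick check of the numbers, under the idealized assumptions of a perfect verifier and independent runs (p = 0.3 is an arbitrary illustration, not a measured rate):

```python
# Idealized amplification: P(at least one verified success in n runs).
p = 0.30                        # assumed per-run solve probability
for n in (1, 5, 10, 20):
    print(n, round(1 - (1 - p) ** n, 3))
# -> 1: 0.3, 5: 0.832, 10: 0.972, 20: 0.999
```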
Lin Yang reposted
@·
C.N. Yang has passed, age 103. Yang was awarded the Nobel prize at 35, for parity violation (shared with T.D. Lee). But his greatest contribution was probably Yang-Mills theory, now referred to as gauge theory. When I was a student the former designation was as common as the
56 replies · 516 reposts · 3.1K likes · 311.5K views
Mengdi Wang @MengdiWang10
🚀 Introducing LabOS: The AI-XR Co-Scientist
A system that sees, understands, and works with humans in real-world labs.
👁️ Egocentric vision & extended reality
🧠 LLM reasoning & hypothesis generation
🤖 Real-time guidance & multi-modal human-AI collaboration
From observation → understanding → collaboration.
Preprint: arxiv.org/abs/2510.14861
10 replies · 25 reposts · 161 likes · 33.5K views
Lin Yang reposted
Watcher.Guru @WatcherGuru
JUST IN: $BNB reaches new ATH of $1,100
346 replies · 405 reposts · 4.2K likes · 237.3K views
Chi Jin @chijinML
Excited to share that I’ve been promoted to Associate Professor with tenure at Princeton!🎉 6 years may not be long, but AI research has evolved significantly during this period. Grateful to all my students, collaborators, colleagues for being with me on this remarkable journey!
149 replies · 58 reposts · 2.7K likes · 113.9K views
Amin Karbasi @aminkarbasi
This started as a fun personal project. The sample is small, so nothing definitive, but a few patterns emerged when we put GPT-5 to the test.
• When the path was clear, it did great: nearly correct proofs in 3/5 problems.
• On Problem 2, it surprised us with a new approximation guarantee that overturned our original conjecture and still solved the problem.
• Often adapted known proofs well, but with a “copy-paste” flavor — skipping unchanged steps instead of exploring natural alternatives.
• Hit a wall on Problems 4 and 5, both needing cross-technique reasoning. Integrative thinking is clearly still tough.
• On Problem 5, it even picked the right algorithm, but could not analyze it properly. The guarantee might exist, just harder than we guessed.
• Compared to older generations, GPT-5 feels sharper in math and occasionally original — small sparks worth noting.
• Prompting matters a lot. Asking for full proofs made its work more complete and self-contained. Better prompts could move the needle even more.
• The failures looked polished at first glance but hid deep flaws. A reminder that these models can sound right while being very wrong.
We did not test peers (@xai, @Google, @anthropic). Proof-checking is labor-intensive, so that part is for another day/team.
Sebastien Bubeck @SebastienBubeck

It's becoming increasingly clear that GPT-5 can solve MINOR open math problems, those that would require a day or a few days of a good PhD student. Of course it's not a 100% guarantee; e.g., below GPT-5 solves 3/5 optimization conjectures. In my opinion, the full impact of this has yet to be internalized...

6 replies · 8 reposts · 141 likes · 27.2K views
Lin Yang reposted
Jiantao Jiao @JiantaoJ
🚀 We’re hiring at NVIDIA! Our team is pushing the frontier of LLM/DLM post-training and system optimization. We are looking for exceptional people with large-scale LLM + systems experience to join us (full time only).
🔹 Focus areas include:
• Post-training of large models
• Systems for LLM/DLM training & inference at scale
• Efficiency, scaling, and evaluation frameworks for LLMs
At NVIDIA, you’ll work with world-class researchers and engineers on cutting-edge foundation models at unprecedented scale.
👉 If you’re passionate about LLMs, systems, and building the next generation of AI, we’d love to hear from you.
📩 If you’re interested, please send me your CV! @nvidia #LLM #AI #Systems #PostTraining #DeepLearning
22 replies · 33 reposts · 471 likes · 103.3K views
Lin Yang @lyang36
@AtaeiMe I was calling the model manually with Hugging Face.
1 reply · 0 reposts · 0 likes · 812 views
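"Calling the model manually with Hugging Face" ordinarily means the standard transformers generate loop rather than a hosted responses-style API. A minimal version looks like the sketch below; the model id is a placeholder for illustration, not the model actually used in the pipeline:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Qwen/Qwen2.5-7B-Instruct"  # placeholder id, not the actual model
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "Prove that sqrt(2) is irrational."}]
ids = tok.apply_chat_template(messages, add_generation_prompt=True,
                              return_tensors="pt").to(model.device)
out = model.generate(ids, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
print(tok.decode(out[0, ids.shape[-1]:], skip_special_tokens=True))
```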
Mehdi Ataei @AtaeiMe
@lyang36 Is it based on the Responses API? I have seen the model work with it.
1 reply · 0 reposts · 0 likes · 832 views