Zeyu Zheng

30 posts

Zeyu Zheng

@regunivers

PhD Candidate @CarnegieMellon | Seed-Prover | Combinatorial Mathematician, AI Researcher. develop new paradigms for mathematical discovery

Katılım Ocak 2022

71 Takip Edilen113 Takipçiler

Zeyu Zheng retweetledi

Weihua Du@StigLidu·5d

Check out our interactive demo 👇 #demo" target="_blank" rel="nofollow noopener">stiglidu.github.io/AdaExplore/#de… Huge thanks to all collaborators 🙌 @JingmingZhuo @yi_xin_dong @Andre3035858461 @sunweiwei12 @regunivers Manupa Karunaratne, Ivan Fox @Tim_Dettmers @tqchenml Yiming Yang @wellecks We would love to hear your feedback!

English

290

Zeyu Zheng retweetledi

Weihua Du@StigLidu·20 Nis

Check out our #ICLR2026 poster, “Generalizable End-to-End Tool-Use RL with Synthetic CodeGym”! We developed CodeGym, an automated pipeline that converts coding problems into multi-turn tool-use environments for agent RL training. We end up with 13K environments and 80K task configurations. Training on CodeGym can significantly improve LLM tool use and multi-turn interaction capabilities on OOD tasks (e.g., Tau-Bench, ALFWorld). Unfortunately, I can't make it to ICLR in person, so sharing our poster here! Paper: arxiv.org/abs/2509.17325 Dataset: huggingface.co/datasets/Vanis…

Weihua Du@StigLidu

How can we boost LLM agents’ generalizability to OOD tasks and environments? Check out CodeGym, our new project for synthesizing environments for LLM agent RL training. CodeGym is a synthetic environment generation framework for reinforcement learning on multi-turn tool-use tasks. It automatically converts static coding problems into interactive and verifiable RL training environments. Training in CodeGym leads to strong OOD generalization — for example, a Qwen2.5-32B-Instruct model achieved an 8.7-point absolute accuracy gain on τ-Bench! We’ve just released the paper, synthesis pipeline, and dataset: 📄 Paper: arxiv.org/abs/2509.17325 💻 Project: github.com/StigLidu/CodeG… 📊 Dataset: huggingface.co/datasets/Vanis… 📷 More details in the thread👇

English

848

Zeyu Zheng retweetledi

Jared Duker Lichtman@jdlichtman·16 Nis

Erdős was the most prolific mathematician in history, after Euler, But Erdős was by far the most prolific *conjecturer* in history, posing many problems big and small. Combinatorialist Joel Spencer said that Erdős was able generate these conjectures based on deeper meta-theories in his mind. So certainly some of these problems are more famous than others, but it may turn out that if enough of them are solved, it may lead to a more unified theory.

Thomas Bloom@thomasfbloom

Some dismiss Erdős problems as trivialities - this couldn't be further from the truth! While many are amusing novelties, some of them are the most central problems in number theory and combinatorics. A blog post with, in my view, the 10 most important: erdosproblems.com/forum/thread/b…

English

207

18K

Zeyu Zheng retweetledi

Thomas Bloom@thomasfbloom·16 Nis

English

203

35.8K

Zeyu Zheng retweetledi

Jared Duker Lichtman@jdlichtman·15 Nis

In my doctorate, I proved the Erdős Primitive Set Conjecture, showing that the primes themselves are maximal among all primitive sets. This problem will always be in my heart: I worked on it for 4 years (even when my mentors recommended against it!) and loved every minute of it. [Primitive sets are a vast generalization of the prime numbers: A set S is called primitive if no number in S divides another.] Now Erdős#1196 is an asymptotic version of Erdős' conjecture, for primitive sets of "large" numbers. It was posed in 1966 by the Hungarian legends Paul Erdős, András Sárközy, and Endre Szemerédi. I'd been working on it for many years, and consulted/badgered many experts about it, including my mentors Carl Pomerance and James Maynard. The the proof produced by GPT5.4 Pro was quite surprising, since it rejected the "gambit" that was implicit in all works on the subject since Erdős' original 1935 paper. The idea to pass from analysis to probability was so natural & tempting from a human-conceptual point of view, that it obscured a technical possibility to retain (efficient, yet counter-intuitve) analytic terminology throughout, by use of the von Mangoldt function \Lambda(n). The closest analogy I would give would be that the main openings in chess were well-studied, but AI discovers a new opening line that had been overlooked based on human aesthetics and convention. In fact, the von Mangoldt function itself is celebrated for it's connection to primes and the Riemann zeta function--but its piecewise definition appears to be odd and unmotivated to students seeing it for the first time. By the same token, in Erdős#1196, the von Mangoldt weights seem odd and unmotivated but turn out to cleverly encode a fundamental identity \sum_{q|n}\Lambda(q) = \log n, which is equivalent to unique factorization of n into primes. This is the exact trick that breaks the analytic issues arising in the "usual opening". Moreover, Terry Tao has long suspected that the applications of probability to number theory are unnecessarily complicated and this "trick" might actually clarify the general theory, which would have a broader impact than solving a single conjecture.

Boaz Barak@boazbaraktcs

This is one of the coolest such examples! See comments from Lichtman below, who proved the related primitive set conjecture arxiv.org/abs/2202.02384

English

379

2.9K

976K

Zeyu Zheng retweetledi

Chenyu Zhou@ZKBquanter·10 Nis

🚀 New survey dropped! "Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering" LLM agents are evolving — not by changing weights, but by building better infrastructure around them. 🧵👇 1/4

English

171

Zeyu Zheng retweetledi

Tianle Cai@tianle_cai·10 Nis

x.com/i/article/2042…

ZXX

100

644

222.6K

Zeyu Zheng retweetledi

Mehtaab Sawhney@mehtaab_sawhney·9 Nis

We’ve just released another paper solving five further Erdős problems with an internal model at OpenAI: arxiv.org/abs/2604.06609. Several of the proofs were especially enjoyable to digest while writing the paper. My personal favorite was the solution to Erdős Problem 1091. The question asks: if a graph G has chromatic number 4, while every small subgraph has chromatic number at most 3, must it contain an odd cycle with many diagonals? The internal model gives a very enlightening counterexample to this conjecture, and the proof was a pleasure to understand. For those so inclined, a really fun exercise is to try to reconstruct the proof from Figure 5 of the paper, which was of course produced by Codex.

English

151

901

202K

Zeyu Zheng@regunivers·8 Nis

Finally 🥑

Shengjia Zhao@shengjia_zhao

Excited to share what we’ve been building at Meta Superintelligence Labs! We just released Muse Spark, our first AI model. It's a natively multimodal reasoning model and the first step on our path to personal superintelligence. We've overhauled our entire stack to support scaling, and this is just the beginning. ai.meta.com/blog/introduci…

English

Zeyu Zheng@regunivers·3 Nis

@FabianGloeckle Amazing work!

English

628

Zeyu Zheng retweetledi

Fabian Gloeckle@FabianGloeckle·3 Nis

A new milestone in automatic formalization: We translated an entire graduate math textbook into Lean using 30K LLM agents. Open-source, large-scale multi-agent inference that actually works > Blueprint+Lean: faabian.github.io/algebraic-comb… > Codebase+preprint: github.com/facebookresear… 1/7

English

138

704

88.5K

Zeyu Zheng retweetledi

Timothy Gowers @wtgowers@wtgowers·1 Nis

What a difference a year makes.

Timothy Gowers @wtgowers@wtgowers

It's finally happened: after several unsuccessful attempts, I found a prompt that got Grok to solve a maths problem (the well-known Dubnovy Blazen problem in graph theory) I've been working on for over a year. How long till it's better than human mathematicians across the board?

English

294

45.1K

Zeyu Zheng retweetledi

Dan Roy@roydanroy·4 Mar

How are mathematicians facing the wave of rapidly advancing AI-for-math capabilities? Jeremy Avigad (CMU prof and co-author on the original 2015 system description paper for Lean) just posted a paper with his thoughts in the wake of the Math, Inc. announcement on sphere packing. andrew.cmu.edu/user/avigad/Pa… There are a lot of interesting passages in here, including a bit of the back story of the Math, Inc. bomb drop and how it was initially received by the humans working on the formalization project. But, as for how mathematics proceeds, here's the key last passage: "We need to remember our strengths: mathematicians are problem solvers and theory builders extraordinaire. Rather than fight the use of AI in mathematics, we should own it. It is not enough to keep up with current events and design benchmarks for AI researchers; we need to play an active role in deploying the technology and molding it to our purposes. We also need to learn how to raise our students with the wisdom to use the new technologies appropriately, and we need to be careful that we still manage to impart core mathematical intuitions and understanding. Figuring out how to use AI effectively to achieve our mathematical goals won’t be easy, but mathematicians have always embraced challenges—indeed, the harder, the better. If we face AI head-on and stay true to our values, mathematics will thrive. We just need to show up and get to work." The next few years should be a golden era for mathematics. For those of us working on the frontier, I hope we do well by our mathematician colleagues.

English

192

851

108.2K

Zeyu Zheng retweetledi

Mehtaab Sawhney@mehtaab_sawhney·25 Şub

We just posted a paper solving Erdos #846, which was solved by an internal model at OpenAI (cdn.openai.com/infinite-sets/…). While the problem can also be derived from an earlier paper in the literature, the proof by the internal model was one of the first instances where I smiled reading the proof.

English

313

75.4K

Zeyu Zheng retweetledi

Zhaopeng Tu@tuzhaopeng·25 Şub

Does an AI agent really need to "think hard" for every single step? 🤖🧠⚡ Introducing CogRouter, a framework grounded in ACT-R theory that trains agents to dynamically adapt cognitive depth across four hierarchical levels, from instinctive responses to strategic planning. 🧠 We identify a critical inefficiency in current agents where they either think too little (reflexive) or think too much (uniform deep reasoning). Real-world tasks demand step-wise heterogeneity: 1️⃣ Routine steps require instinctive responses; 2️⃣ Complex steps require strategic reasoning; 3⃣ Fixed patterns waste tokens or fail on hard tasks. 🎓 We propose a two-stage training strategy to enable dynamic cognitive adaptation: 1️⃣ Cognition-aware SFT (CogSFT) instills stable level-specific patterns; 2️⃣ Cognition-aware Policy Optimization (CoPO) enables step-level credit assignment via confidence-aware advantage reweighting. 📊 Experiments on ALFWorld and ScienceWorld show CogRouter achieves state-of-the-art performance with superior efficiency: 🏆 82.3% success rate with Qwen2.5-7B; 🔥 Outperforms GPT-4o (+40.3%) and OpenAI-o3 (+18.3%); ⚡ Uses 62% fewer tokens than GRPO by skipping unnecessary reasoning. 🧑‍💻 Code: github.com/rhyang2021/Cog… 📷 Paper: arxiv.org/abs/2602.12662

English

9.2K

Zeyu Zheng retweetledi

Yujia Qin@TsingYoga·17 Şub

Happy CNY! We are glad to introduce our latest language model Seed-2.0. We make great progress (agent, reasoning, vision understanding, etc.) since Seed-1.8 without any distillation Right now it's only available in CN now, and will soon be ready globally. seed.bytedance.com/en/seed2

English

184

14K

Zeyu Zheng@regunivers·15 Şub

Really happy to share Seed 2.0, the strongest general-purpose model for formal mathematics to date. Vibe coding is already here. Vibe proofing is right around the corner. Model card: lf3-static.bytednsdoc.com/obj/eden-cn/la…

Thomas Zhu@hanwen_zhu

I want to point to an aspect of Seed2.0's ability—Seed2.0 is the first general LLM to incorporate agentic formal math capability, even surpassing Seed-Prover 1.5. I tested "vibe-proving" on several real-world problems in our Trae IDE.

English

2.3K

Zeyu Zheng@regunivers·9 Oca

@AcerFur @fleetingbits That said, my personal perspective is still that the point at which the community largely converged on a clear, consensus formulation of the problem was around January 5. This is not meant to diminish the result at all, just to clarify how I was thinking about the timeline.

English

216

Zeyu Zheng@regunivers·9 Oca

@AcerFur @fleetingbits Thanks for the clarification. I fully agree that this is a very meaningful result, and personally I think that if this contribution had been made by a human, it would absolutely be worth publishing.

English

209

Zeyu Zheng@regunivers·8 Oca

I have seen several recent claims that AI has solved open Erdős problems. I took a closer look and summarized what I found here. zeyu-zheng.github.io/blog/erdősConj…

English

12.7K

Zeyu Zheng@regunivers·9 Oca

@fleetingbits Whether this “counts” depends on perspective. One could say AI solved a genuinely new problem, but one could also note that the corrected statement had been open for only about a day. My personal view is that if this contribution were made by a human, it would be worth publishing

English

335

Zeyu Zheng@regunivers·9 Oca

@fleetingbits Facts: The original statement of 728 was ambiguous, and AI quickly found a trivial counterexample. The community then discussed how to fix the statement and proposed an additional hypothesis. Only after this corrected formulation was identified did people use AI to solve it.

English

730

Keşfet

@JingmingZhuo @yi_xin_dong @Andre3035858461 @sunweiwei12 @Tim_Dettmers @tqchenml @wellecks @FabianGloeckle