
Zeyu Zheng
30 posts

@regunivers
PhD Candidate @CarnegieMellon | Seed-Prover | Combinatorial Mathematician, AI Researcher. Developing new paradigms for mathematical discovery.



How can we boost LLM agents’ generalizability to OOD tasks and environments? Check out CodeGym, our new project for synthesizing environments for LLM agent RL training.
CodeGym is a synthetic environment generation framework for reinforcement learning on multi-turn tool-use tasks. It automatically converts static coding problems into interactive and verifiable RL training environments.
Training in CodeGym leads to strong OOD generalization: for example, a Qwen2.5-32B-Instruct model achieved an 8.7-point absolute accuracy gain on τ-Bench!
We’ve just released the paper, synthesis pipeline, and dataset:
📄 Paper: arxiv.org/abs/2509.17325
💻 Project: github.com/StigLidu/CodeG…
📊 Dataset: huggingface.co/datasets/Vanis…
📷 More details in the thread👇

Some dismiss Erdős problems as trivialities - this couldn't be further from the truth! While many are amusing novelties, some of them are the most central problems in number theory and combinatorics. A blog post with, in my view, the 10 most important: erdosproblems.com/forum/thread/b…


This is one of the coolest such examples! See comments from Lichtman below, who proved the related primitive set conjecture arxiv.org/abs/2202.02384



Excited to share what we’ve been building at Meta Superintelligence Labs! We just released Muse Spark, our first AI model. It's a natively multimodal reasoning model and the first step on our path to personal superintelligence. We've overhauled our entire stack to support scaling, and this is just the beginning. ai.meta.com/blog/introduci…



It's finally happened: after several unsuccessful attempts, I found a prompt that got Grok to solve a maths problem (the well-known Dubnovy Blazen problem in graph theory) I've been working on for over a year. How long till it's better than human mathematicians across the board?








I want to highlight one aspect of Seed2.0's ability: Seed2.0 is the first general LLM to incorporate agentic formal math capability, even surpassing Seed-Prover 1.5. I tested "vibe-proving" on several real-world problems in our Trae IDE.







