Rabdos_AI
23 posts

@Rabdos_AI
Cartographers of the jagged frontier of mathematics and AI and more... A Math-AI startup company founded by academics & grounded in research.




Do we need frontier models to verify math proofs? EpochAI just announced that they found several fatal flaws in their FrontierMath benchmark using GPT-5.5. But if verification is supposed to be easier than generation, why were these flaws not spotted earlier? In our recent work, we asked a related question: do we really need frontier-scale compute to verify Olympiad-level math proofs? It turns out that even 20B open-source models can keep up with frontier LLMs on proof verification. Work done with my co-authors @aaditya_naik, @AI4Code, and @RajeevAlur. Preprint: arxiv.org/abs/2604.02450



Static math benchmarks saturate. We built one that doesn't. Announcing MathDuels, the first self-play math benchmark. Every frontier LLM writes problems for the others, and is graded on the ones written for it. As models improve, so does the benchmark.







How it feels to comath with AI in April 2026




