Valentin Thomas
99 posts

@_valthomas
technical person @cohere, PhD from Mila. Formerly @layer6AI, @deepmind. Interested in RL, reasoning, ICL.


Geoffrey Hinton says mathematics is a closed system, so AIs can play it like a game. They can pose problems to themselves, test proofs, and learn from what works, without relying on human examples. “I think AI will get much better at mathematics than people, maybe in the next 10 years or so.”


Second-order methods and preconditioner-based methods are **NOT** the same. Please stop using them interchangeably!

🪂 Understanding R1-Zero-Like Training: A Critical Perspective
* DeepSeek-V3-Base already exhibits "Aha moment" before RL-tuning??
* The ever-increasing output length in RL-tuning might be due to a BIAS in GRPO??
* Getting GRPO Done Right, we achieve a 7B AIME sota!
🧵
📜 Full details: github.com/sail-sg/unders…
🛠️ Code: github.com/sail-sg/unders…

As a taxpayer (irrespective of whether you’re a scientist), would you be in favor of more of the @NIH budget going to fund efforts to solve specific diseases at the expense of basic exploratory research? Which diseases?
