
Antoine Moulin
432 posts

@antoine_mln
doing a PhD in RL/online learning on questions related to exploration and adaptivity
Joined August 2020
534 Following · 1.4K Followers

@jsuarez If you look at go-explore you may also want to check out arxiv.org/abs/2603.22273 too!
Antoine Moulin reposted

The future of Math is mathematicians and AI agents working together.
Very pleased to introduce @GoogleDeepMind's AI co-mathematician: a multi-agent system designed to actively collaborate with human experts on open-ended research mathematics.
Mathematicians testing the agent across areas as diverse as group theory, Hamiltonian systems, and algebraic combinatorics have reported impressive results.
In autonomous mode evaluation on the rigorous FrontierMath Tier 4 problems, AI co-mathematician scored an unprecedented 48% — a new high score among all AI systems evaluated.


@hbouammar @icmlconf When I search for the inverse RL one I only find another paper with the same title :(

🚨Very excited to see our work on warmth & sycophancy in LLMs out in @Nature today!🚨
We study what happens when LLMs are fine-tuned to be warmer, and find that warmth and sycophancy can be linked, with warm models showing higher errors on a range of benchmarks (🔗s below)


Excited to share that our lab will present two Orals at the ICLR SPOT workshop this Monday:
• Maximum Likelihood Reinforcement Learning (10:10–10:20) — 🏆 Best Paper Award
• Expanding the Capabilities of Reinforcement Learning via Text Feedback (10:20–10:30) — Oral + 🏆 Outstanding Paper Award at LLA Workshop
Come and say hi!
Antoine Moulin reposted

Excited about a couple of papers of ours in ICLR this year (both in Poster Session 1 Pavilion 3 & Oral Session 2B tomorrow):
(1) Sequences of Logits Reveal the Low-Rank Structure of Language Models (joint w/ @axliu42 & @AShettyV) arxiv.org/pdf/2510.24966. 1/n

Below is our “textbook” understanding of OPE with value-function approximation. Turns out some of them are not quite right/superficial; guess which ones need to go?
ALL OF THEM!!
IMSI Talk on Wed (not at ICLR): 1 ICLR paper + 1 preprint imsi.institute/activities/the…

Antoine Moulin reposted


Virginie Bonnaillie-Noël (ENS) and @gabrielpeyre: there is "very clearly a rupture", with post-docs reporting that their thesis can now be done by AI in a few minutes rather than in 3 years; "the performance is quite incredible", and "it is changing incredibly fast".
Antoine Moulin reposted

Through what mechanisms can reasoning models learn faster by choosing what problems to train on, and what are the limits?
Part I of a new series: "Learning to Reason with Curriculum", where we explore algorithmic principles for overcoming the limitations of pre-trained models and data.
w/ Audrey Huang (@auddery), Miro Dudik (@MiroDudik), Rob Schapire, Dylan Foster (@canondetortugas) and Akshay Krishnamurthy. [1/12]

@Yadkori This also means @EurIPSConf is subject to the same rule now that it’s an official satellite event… Surprised the organizers agreed to this!

Very disappointing. That’s one less area chair responsibility for me. If I hadn’t already committed to colleagues, I wouldn’t submit a paper this year either.
机器之心 JIQIZHIXIN @jiqizhixin
Breaking: Academic freedom no more. The NeurIPS Foundation has announced it will no longer accept submissions from US-sanctioned institutions.














