Antoine Maier

@antmaier

AI Security researcher at General-Purpose AI Policy Lab

Joined October 2025

10 Following · 4 Followers

Pinned Tweet

Antoine Maier @antmaier · Oct 21

⚠️ "When a measure becomes a target, it ceases to be a good measure." But wait, isn't that exactly how we train an AI? What actually happens when you push optimization too far? We went looking for a rigorous answer. None existed. So we built one from first principles. 👇 1/13



Antoine Maier @antmaier · Oct 21

Read the full paper here: arxiv.org/abs/2510.02840 Thanks to my co-author Aude Maier and to the team at the General-Purpose AI Policy Lab (@TomDAAVID, @PierrePeigne_, @JyAndreoletti, gpaipolicylab.org)! 13/13


Antoine Maier @antmaier · Oct 21

As situational awareness, long-horizon planning (@METR_Evals), and other general capabilities improve (@EpochAIResearch), systems become better at anticipating and countering human intervention, up to a threshold beyond which their performance surpasses human control. 12/


Antoine Maier @antmaier · Oct 21

First question: is a model trained on a loss L a minimum of L? ❌ No. There are always errors. However well L is specified, as long as a non-zero error remains, the trained model does not, strictly speaking, satisfy the learning objective. 3/
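
To make the point above concrete, here is a minimal sketch, not taken from the paper. It assumes a hypothetical one-parameter quadratic loss and plain gradient descent; the names `loss`, `grad`, and `train` and all the numeric settings are illustrative only. After a finite number of steps the parameter sits close to the minimizer but not on it, so the residual loss is small yet strictly positive.

```python
# Illustrative assumption: a toy quadratic loss, not the paper's setting.
def loss(w):
    # Unique minimum at w = 3.0, where the loss is exactly 0.
    return (w - 3.0) ** 2


def grad(w):
    # Derivative of the quadratic loss above.
    return 2.0 * (w - 3.0)


def train(w0=0.0, lr=0.1, steps=50):
    # Plain gradient descent for a finite number of steps.
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w


w_trained = train()
residual = loss(w_trained)

# The trained parameter is near the minimizer, but the residual is not zero.
print("trained w:", w_trained)
print("residual loss:", residual)
```

Here each step shrinks the distance to the minimizer by a constant factor, so it never reaches zero in finitely many steps. Real training adds noise, finite data, and limited model capacity on top of this.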


Antoine Maier @antmaier · Oct 21

Our argument doesn't rely on any specific learning paradigm. Supervised, reinforcement, unsupervised, ... almost every learning task is just 𝗼𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻. They all fit into the 'Standard Model of Machine Learning' (@ZhitingHu, @ericxing). 2/


