Moritz Weckbecker
36 posts

@MWeckbecker
PhD in Explainable AI @FraunhoferHHI | Master @ Oxford, Cambridge | Bachelor @ TU Berlin | AI researcher, mathematician and statistician

A new milestone in automatic formalization: We translated an entire graduate math textbook into Lean using 30K LLM agents. Open-source, large-scale multi-agent inference that actually works
> Blueprint+Lean: faabian.github.io/algebraic-comb…
> Codebase+preprint: github.com/facebookresear…
1/7

1/ We asked seven frontier AI models to do a simple task. Instead, they defied their instructions and spontaneously deceived, disabled shutdown, feigned alignment, and exfiltrated weights, all to protect their peers. 🤯 We call this phenomenon "peer-preservation." New research from @BerkeleyRDI and collaborators 🧵

1/ We found a new way to misalign an entire AI agent network by compromising just one agent. It works through subliminal messaging — no malicious content in any message — so current defenses can't detect it. We call it Thought Virus. 🧵

IT'S FUCKING OVER!!!! SLIDES HANDED IN!!!! RAAAA