MATS Research
@MATSprogram
MATS empowers researchers to advance AI alignment, transparency, and security


We've seen AI models deceive, gaslight, and drive users to psychosis: safety issues that labs didn't anticipate until they caused real harm. We built the first benchmark of these "unknown unknown" alignment failures and found that OOD detection can help prevent them. 🧵

New research from Team Shard & @jasminexli! AIs increasingly fake good behavior, which might ruin our ability to evaluate models. We trained models to be eval-cooperative: to want to give evaluators accurate info. This surfaces hidden misalignment! 🧵




1/ 🚨 MATS Autumn 2026 applications are now open. 10-week fully-funded fellowship for aspiring AI alignment, security & governance researchers and field-builders. 📍 Berkeley + London 📅 Sep 28 – Dec 4, 2026 💰 $5,000/month stipend + $8,000/month compute Apply by June 7 AoE ↓















