Michael Matthews
151 posts

@mitrma
PhD student @FLAIR_Ox

Announcing ARC-AGI-3. The only unsaturated agentic intelligence benchmark in the world. Humans score 100%, AI <1%. This human-AI gap demonstrates we do not yet have AGI. Most benchmarks test what models already know; ARC-AGI-3 tests how they learn.




After ARC-AGI-3 is saturated, there will still be @NetHack_LE / balrogai.com left to conquer.


I'm looking for a PhD intern for next year, co-advised with Scott Fujimoto, for a project developing sample-efficient RL algorithms for long-horizon decision-making. If you've worked on off-policy/MBRL, hierarchical RL, or embodied AI, we'd love to hear from you! Contact below.



🪩The one and only @stateofai 2025 is live! 🪩 It’s been a monumental 12 months for AI. Our 8th annual report is the most comprehensive it's ever been, covering what you *need* to know about research, industry, politics, safety and our new usage data. My highlight reel:



🌹 Today we're releasing Unifloral, our new library for Offline Reinforcement Learning! We make research easy:
⚛️ Single-file
🤏 Minimal
⚡️ End-to-end Jax
Best of all, we unify prior methods into one algorithm: a single hyperparameter space for research! ⤵️












