Louis retweeted

If you're wondering whether saturating ARC-AGI-1 or 2 means we have AGI now... I refer you to what I said when we launched ARC-AGI-2 last year (the same thing I said when we announced ARC-AGI-2 was coming, in Spring 2022, before the rise of LLM chatbots)...
The ARC-AGI series is not an AGI threshold, it's a compass that points the research community toward the right questions.
ARC-AGI-1 is a minimal test of fluid intelligence -- to pass it, you needed to show nonzero fluid intelligence. This required AI to move past the classic deep learning / LLM paradigm of pretraining scaling + static models at inference, toward test-time adaptation.
ARC-AGI-2 is the same, but with tasks that probe deeper levels of reasoning complexity (particularly with regard to concept composition). Still, these are tasks that are solvable in minutes by regular people with no external tool use (we hired our test takers off the street), so it does not represent the upper bound of what human fluid intelligence can achieve (say, solving a Millennium problem).
ARC-AGI-3 (launching March 2026) probes interactive reasoning: we evaluate how systems explore unknown environments, model them, set their own goals, and plan/execute towards these goals, autonomously, without instructions.
We have also started work on ARC-AGI-4 and ARC-AGI-5, which I am pretty excited about!
