Dries Smit
153 posts

Dries Smit
@DriesSmit1
Researcher at Tufa Labs, working on fundamental AI research. https://t.co/2HNB9DxLWj


Today we're launching ARC-AGI-3 135 Novel Environments (nearly 1K levels) we build by hand It is the only unsaturated agent benchmark in the world Each game is 100% human solvable, AI scores <1% This gap between human and AI performance proves we do not have AGI Agents today need human handholding. Agents that beat V3 will prove they don’t need that level of supervision. Agents that beat V3 will demonstrate: * Continual learning - Each level builds on top of each other. You can’t beat level 3 without carrying forward what you learned in levels 1 and 2. * World modeling - Many of the environments require planning actions many actions ahead. AI will have no choice but to build an internal world model for how the environment works, run simulations “in its head” and proceed with an action In our early testing, we’ve seen a few clear failure modes of AI: * Anticipation of future events - If an environment requires that AI set up a scene, and then carry out a scenario (like in sp80), it starts to break down. * Anchoring on early hypothesis - Early in a game it comes up with a hypothesis (even if wrong) and refuses to update its beliefs later. * Thinking it’s playing another game - AI thinks it’s playing chess, pacman. The training data holds hard! One major problem is there is too much data to carry forward in a single context. Models must learn what to remember and what to forget The agent that beats ARC-AGI-3 will have demonstrated the most authoritative evidence of progress towards general intelligence to date We're excited to get this out and excited to see what you think


We scored 36.08% on ARC-AGI-3 in one day using the Agentica SDK.






@jm_alexia @PourcelJulien @cedcolas @pyoudeyer @LiaoIsaac91893 @JFPuget @dvhrtm @JDisselh ARC Prize 2025 / Top Score Winner Third Place / MindsAI @ Tufa Labs @MindsAI_Jack, @DriesSmit1, @MohamedOsmanML, @bayesilicon Interview: youtu.be/3lXXfNsWIgo


people forget about the Baldwin effect when reasoning about the difference between evolution and learning when you recapitulate learning that your ancestors did, when you learn to see, walk, and speak in an unbroken line for thousands of generations, it gets easier each time


Great writeup by Dr. Dries Smit about his winning approach to the ARC 3 Preview competition. @DriesSmit1 @arcprize @dries.epos/1st-place-in-the-arc-agi-3-agent-preview-competition-49263f6287db" target="_blank" rel="nofollow noopener">medium.com/@dries.epos/1s…

Learning #3: The collective intelligence of the community is great. We hosted a Agent Preview Competition and awarded prizes to @DriesSmit1 @MindsAI_Jack and Will Dick github.com/wd13ca












