
Jax Voss

Humans: 100%
Gemini 3.1 Pro: 0.37%
GPT 5.4: 0.26%
Opus 4.6: 0.25%
Grok-4.20: 0.00%

François Chollet just released ARC-AGI-3 -- the hardest AI test ever created.

135 novel game environments. No instructions. No rules. No goals given. Figure it out or fail.

Untrained humans solved every single one. Every frontier AI model scored below 1%.

Each environment was handcrafted by game designers. The AI gets dropped in and has to explore, discover what winning looks like, and adapt in real time.

The scoring punishes brute force. If a human needs 10 actions and the AI needs 100, the AI doesn't get 10%. It gets 1%. You can't throw more compute at this.

For context: ARC-AGI-1 is basically solved. Gemini scores 98% on it. ARC-AGI-2 went from 3% to 77% in under a year. Labs spent millions training on earlier versions. ARC-AGI-3 resets the entire scoreboard to near zero.

The benchmark launched live at Y Combinator with a fireside between Chollet and Sam Altman. $2M in prizes on Kaggle. All winning solutions must be open-sourced.

Scaling alone will not close this gap. We are nowhere near AGI.

(Link in the comments)
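For anyone wondering how "10 actions vs. 100 actions" turns into 1% rather than 10%: one reading that fits the example is a squared efficiency ratio. The post gives only that single data point, so the quadratic form, the cap at 100%, and the function name `efficiency_score` below are all assumptions for illustration. This is a rough sketch, not the benchmark's published formula.

```python
def efficiency_score(human_actions: int, ai_actions: int) -> float:
    """Hypothetical per-environment score: squared action-efficiency ratio.

    Assumption: the score is (human_actions / ai_actions) ** 2, capped at 1.0.
    This matches the post's single example but is not confirmed by it.
    """
    if human_actions <= 0 or ai_actions <= 0:
        raise ValueError("action counts must be positive")
    # Cap the ratio so using fewer actions than the human baseline
    # does not score above 100% (also an assumption).
    ratio = min(1.0, human_actions / ai_actions)
    return ratio ** 2


# The example from the post: human needs 10 actions, AI needs 100.
print(f"{efficiency_score(10, 100):.0%}")  # prints 1%
```

Under this assumed rule, a linear score would have given 10%, so taking ten times as many actions costs a factor of 100 instead of a factor of 10.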

@kosa64 But somehow nobody complains about the subsidies for electric cars and the free parking XDDDDDD

@m_wojtkiewicz @kosa64 Yes, because all the books are on Empik Go and Legimi. I'm banned from buying physical books. Some time ago an outlet (literally a warehouse) opened next to our apartment in Warsaw. Books were about 5 zł each on average, so she got her strength training in... But the space ran out quickly.

@kosa64 How many meters is it to the TV? It looks like an even bigger TV could fit.