
@fchollet I think this benchmark isn't testing intelligence at all. It tests a single agent with no tools, no memory infrastructure, and no self-modification, a scenario that doesn't match real intelligence. Meanwhile, the ARC team itself used collaboration, tools, and experiments to build the test.














