Agentica

65 posts

@agenticasdk

Agentica, by @Symbolica. An agent framework for tool use and multi-agent orchestration through arbitrary code execution.

Joined January 2026
1 Following · 1.9K Followers
Pinned Tweet
Agentica @agenticasdk
We set a new ARC-AGI-2 SotA: 85.28% using an Agentica agent (~350 lines) that writes and runs code.
8
49
555
180.9K
François Chollet @fchollet
@agenticasdk 1. Is it seeing each game only once? (it is of course possible to brute-force any game given infinite trials, but that is not the goal here) 2. Is it using a number of actions per game comparable to what humans need? (upon seeing the game for the first time)
9
6
276
38.9K
Agentica @agenticasdk
We have now solved all publicly available ARC-AGI-3 puzzles.🧩
40
76
1.1K
206.9K
Agentica @agenticasdk
Join us this Thursday, 26th in London. We’ll be speaking about ARC-AGI-2 and ARC-AGI-3 at the London AI nerd meetup: what’s new, what’s changing, and what it means for agents. @its_hapenin @FireworksAI_HQ luma.com/z4i401h6
2
3
22
1.6K
Vinod Khosla @vkhosla
Well well… ARC-AGI-2 (François Chollet’s “hardest” benchmark) is starting to smell like toast. 🍞🔥 @agenticasdk just set a new SOTA: 85.28% with an Agentica agent (~350 lines) that writes & runs code. Best part: it’s not ARC-specialized—it's a general system that’s strong across other benchmarks too. Details at symbolica.ai/blog/arcgentica What benchmark should we throw at it next?
19
31
294
54K
Agentica retweeted
Greg Kamradt @GregKamradt
Really cool to see teams already working on ARC-AGI-3 We'll have public replay links for every online run
Agentica@agenticasdk

👀

5
8
146
11.5K
Agentica @agenticasdk
ARC-AGI-3 ft09 solved in 346 steps
11
30
339
55.5K
Agentica retweeted
Derya Unutmaz, MD @DeryaTR_
I tried the ARC-AGI-3 test set, it’s quite well done, but I am sure AI models will solve it before year's end. It looks like it began before it was released ☺️. I suggest they already plan to release ARC-AGI-4 soon!
Agentica@agenticasdk

👀

5
13
186
15.6K
Lisan al Gaib @scaling01
Earlier today I wanted to doom about Gemini 3.1 Pro completely failing ARC-AGI-3. Turns out this was due to a bug in the config introduced by GPT-5.3. It was still calling Gemini 3.0 Pro instead of 3.1. I fixed it, made the harness simpler and spent $120. Performance of Gemini 3.1 Pro is much better than the almost random performance of 3.0 Pro. Gemini 3.1 Pro can actually solve some games.
Lisan al Gaib@scaling01

x.com/i/article/2024…

13
15
665
77.4K
Agentica @agenticasdk
Many people have asked us: what changes when an agent has access to a persistent Python runtime? We ran a side-by-side comparison to demonstrate: Agentica's Python REPL-based agent vs traditional tool calling agents Full breakdown below 👇
2
3
32
2.9K
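To make the contrast in the post above concrete, here is a minimal sketch of what a persistent runtime changes. It is not Agentica's API: `PersistentRepl` and `stateless_tool_call` are hypothetical names invented for illustration, and the "agent" is just hand-written code strings passed to Python's built-in `exec`.

```python
import contextlib
import io


class PersistentRepl:
    """Toy stand-in for a persistent runtime (hypothetical, not Agentica's API).

    Every call to run() executes in the same namespace, so variables
    defined in one step are still available in the next.
    """

    def __init__(self):
        self.namespace = {}

    def run(self, source: str) -> str:
        # Capture anything the snippet prints and return it as text.
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(source, self.namespace)
        return buffer.getvalue()


def stateless_tool_call(source: str) -> str:
    """Toy stand-in for a stateless tool call: fresh namespace every time."""
    return PersistentRepl().run(source)


repl = PersistentRepl()
repl.run("grid = [[0, 1], [1, 0]]")              # step 1: define state
total = repl.run("print(sum(map(sum, grid)))")   # step 2: reuse it
print(total)
```

In the stateless variant, the second step would fail with a `NameError` because `grid` was never defined in its namespace; the caller would have to resend the data with every call.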
Agentica @agenticasdk
Vic (@its_hapenin) will chat ARC-AGI-2 results at Engineering Night London: what worked, what didn’t, and what surprised us.
With Agentica:
• 85.28% with Opus 4.6 (120k) High
• +10pp GPT-5.2 (XHigh) vs CoT
• +20pp Opus 4.5 vs CoT
See you there. Register: luma.com/w5keu427
0
2
22
1.8K
Agentica @agenticasdk
Reasoning. Persistent state. Recursion. In our ARC-AGI implementation, agents autonomously decide when and what state to pass into a sub-agent’s REPL, allowing them to focus on analysing training examples, test inputs or both. Check out the logs github.com/symbolica-ai/a….
1
9
93
5.3K
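The idea of passing selected state into a sub-agent's REPL can be sketched roughly as follows. This is an illustrative assumption, not the implementation in the linked repository: `spawn_sub_repl` and the variable names are hypothetical, and the sub-agent's "work" is a fixed code string rather than model-generated code.

```python
import copy


def spawn_sub_repl(parent_namespace: dict, names: list[str]) -> dict:
    """Build a fresh child namespace holding copies of only the chosen variables.

    Hypothetical helper: the parent decides which state the child sees.
    """
    return {name: copy.deepcopy(parent_namespace[name]) for name in names}


# State accumulated in the parent agent's runtime (made-up example data).
parent = {
    "train_examples": [([1, 2], [2, 4]), ([3], [6])],
    "test_inputs": [[5, 7]],
    "notes": "scratch work the child does not need",
}

# The parent hands over only the training examples, so the child
# focuses on checking a candidate rule against them.
child = spawn_sub_repl(parent, ["train_examples"])
exec(
    "rule_ok = all([2 * x for x in inp] == out for inp, out in train_examples)",
    child,
)
print(child["rule_ok"])
```

Because the child receives deep copies, anything it defines or mutates stays in its own namespace; the parent would read back only the results it asks for, such as `rule_ok` here.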