Pinned Tweet
Agentica
65 posts

Agentica
@agenticasdk
Agentica, by @Symbolica. An agent framework for tool use and multi-agent orchestration through arbitrary code execution.
Joined January 2026
1 Following · 1.9K Followers
Agentica retweeted

ARC Weekly Meeting — Sunday, March 1 at 1 PM EST
Excited to dig into it and see how it works!
Special Guest: Symbolica Agentica Team
Join: discord.gg/MSQay8r7mg
Agentica @agenticasdk
We've released the code used to solve all 3 publicly available ARC-AGI-3 games. github.com/symbolica-ai/A…

@agenticasdk 1. Is it seeing each game only once? (it is of course possible to brute-force any game given infinite trials, but that is not the goal here)
2. Is it using a number of actions per game comparable to what humans need? (upon seeing the game for the first time)

We've released the code used to solve all 3 publicly available ARC-AGI-3 games.
github.com/symbolica-ai/A…

Join us this Thursday, 26th in London.
We’ll be speaking about ARC-AGI-2 and ARC-AGI-3 at the London AI nerd meetup: what’s new, what’s changing, and what it means for agents.
@its_hapenin @FireworksAI_HQ
luma.com/z4i401h6


@vkhosla @agenticasdk You should try ARC-AGI-3 (developer preview is available now, full benchmark coming in a few weeks)

Well well… ARC-AGI-2 (François Chollet’s “hardest” benchmark) is starting to smell like toast. 🍞🔥
@agenticasdk just set a new SOTA: 85.28% with an Agentica agent (~350 lines) that writes & runs code.
Best part: it’s not ARC-specialized—it's a general system that’s strong across other benchmarks too. Details at symbolica.ai/blog/arcgentica What benchmark should we throw at it next?
Agentica retweeted

Really cool to see teams already working on ARC-AGI-3
We'll have public replay links for every online run
Agentica @agenticasdk
👀
Agentica retweeted

I tried the ARC-AGI-3 test set, it’s quite well done, but I am sure AI models will solve it before year's end. It looks like it began before it was released ☺️. I suggest they already plan to release ARC-AGI-4 soon!
Agentica @agenticasdk
👀

Earlier today I wanted to doom about Gemini 3.1 Pro completely failing ARC-AGI-3.
Turns out this was due to a bug in the config introduced by GPT-5.3. It was still calling Gemini 3.0 Pro instead of 3.1.
I fixed it, made the harness simpler and spent $120.
Performance of Gemini 3.1 Pro is much better than the almost random performance of 3.0 Pro.
Gemini 3.1 Pro can actually solve some games.
Lisan al Gaib @scaling01

Vic (@its_hapenin) will discuss ARC-AGI-2 results at Engineering Night London: what worked, what didn’t, and what surprised us.
With Agentica:
• 85.28% with Opus 4.6 (120k) High
• +10pp GPT-5.2 (XHigh) vs CoT
• +20pp Opus 4.5 vs CoT
See you there.
Register: luma.com/w5keu427

Agentica retweeted

Funnily enough I tried to dabble with ARC AGI before and with very little success…
Super cool to see well-designed RLMs achieving SOTA :)
Agentica @agenticasdk
We set a new ARC-AGI-2 SotA: 85.28% using an Agentica agent (~350 lines) that writes and runs code.

Reasoning. Persistent state. Recursion.
In our ARC-AGI implementation, agents autonomously decide when and what state to pass into a sub-agent’s REPL, allowing them to focus on analysing training examples, test inputs or both.
Check out the logs: github.com/symbolica-ai/a…
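The idea of a parent agent choosing which state to hand to a sub-agent's REPL can be illustrated with a small sketch. This is not Agentica's API: `SubAgentRepl`, `spawn_sub_agent`, and the toy doubling task are hypothetical names and data invented for illustration, and the "agent" code strings are hard-coded here, whereas in a real system a model would write them.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SubAgentRepl:
    """Hypothetical stand-in for a sub-agent's REPL: a namespace that persists across runs."""
    namespace: dict[str, Any] = field(default_factory=dict)

    def run(self, code: str) -> None:
        # Execute code against the persistent namespace, like successive REPL cells.
        exec(code, self.namespace)


def spawn_sub_agent(parent_state: dict[str, Any], keys: list[str]) -> SubAgentRepl:
    # The parent decides which pieces of its state the sub-agent gets to see.
    return SubAgentRepl({k: parent_state[k] for k in keys})


# Toy task (assumption): every output is the input with each element doubled.
parent_state = {
    "train_pairs": [([1, 2], [2, 4]), ([3], [6])],
    "test_inputs": [[5, 7]],
}

# One sub-agent sees only the training examples and builds up state over two runs.
analyst = spawn_sub_agent(parent_state, ["train_pairs"])
analyst.run("rule = lambda xs: [2 * x for x in xs]")
analyst.run("ok = all(rule(i) == o for i, o in train_pairs)")

# A second sub-agent sees only the test inputs plus the rule the first one found.
shared = {**parent_state, "rule": analyst.namespace["rule"]}
solver = spawn_sub_agent(shared, ["test_inputs", "rule"])
solver.run("predictions = [rule(t) for t in test_inputs]")
```

Each sub-agent keeps its variables between `run` calls (persistent state), and the parent controls the scope of each one by selecting keys, so one can focus on training examples while another focuses on test inputs.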