George Morgan

2.5K posts

@vr4300

CEO @symbolica

San Francisco | London · Joined November 2008

305 Following · 3K Followers

Pinned Tweet

George Morgan@vr4300·12 Feb

I'm extremely proud to share that the @symbolica research team has achieved a monumental result in program synthesis. We have reached SOTA on ARC-AGI-2 (85.28% @ $6.94/task) using @agenticasdk as a neurosymbolic program synthesis engine.

This engine is not ARC-specific. It is 350 lines of highly generic code that can be readily adapted to any other task. This is what sets this result apart from other bespoke models or agent system designs that have historically performed well on ARC.

This result is a clear demonstration that the path forward to improving the reasoning capabilities of AI systems is leveraging structure: types, composition, and program execution. Symbolic AI has lain dormant for decades, but the field is on the precipice of making one of the greatest comebacks in history.

Blog: symbolica.ai/blog/arcgentica
Code: github.com/symbolica-ai/a…

Agentica@agenticasdk

We set a new ARC-AGI-2 SotA: 85.28% using an Agentica agent (~350 lines) that writes and runs code.

George Morgan@vr4300·12 Mar

@morganlinton @denisyarats You should try @agenticasdk! No more MCP.

Morgan@morganlinton·11 Mar

The cofounder and CTO of Perplexity, @denisyarats just said internally at Perplexity they’re moving away from MCPs and instead using APIs and CLIs 👀

George Morgan retweeted

Millin Gabani@trillhause_·8 Mar

Wow, this has to be the most underrated article in agent world right now. Completely redefines how we define agents. People may catch up to this implementation of agents in 6 months. Extremely promising.

George Morgan@vr4300·8 Mar

@giles What are you vibe coding these days?

Giles Goddard@giles·8 Mar

There are still a huge number of interesting, novel and gnarly problems that haven't been solved. AI has no idea how to solve them because their models are locked inside their training data. But we're outside of AI's Gödel system. I may not need to program much anymore, but the problem-solving aspect of it is still very much alive if you look for it.

Steve Skojec@SteveSkojec

This is a really thoughtful reflection. I didn’t intend to watch the whole thing, but I ended up doing it anyway. AI is like playing a hard game you can’t beat with cheat codes on. It’s amazing at first, but it becomes boring very quickly. But worse than that, it does something to your brain that ruins the game. If you turn the cheat codes off, you become acutely aware that you’re now struggling unnecessarily. You can’t forget how easy it was, but you don’t want it to be that easy because it takes all the fun out of it, but now the inability to unsee what you’ve seen creates a tension that causes you to lose interest in even continuing to play. The magic is gone. You’ve broken the spell. AI is doing this to life. And the societal consequences are going to be enormous.

George Morgan@vr4300·1 Mar

Come listen to the @symbolica research team speak about our progress on @arcprize! 🙂

Mark Barney@82deutschmark

ARC Weekly Meeting: Sunday, March 1 at 1 PM EST
Excited to dig into it and see how it works!
Special Guest: Symbolica Agentica Team
Join: discord.gg/MSQay8r7mg

George Morgan@vr4300·27 Feb

The research team did indeed put this together. However, category theory is applied to build our own foundation models. Agentica is "category theory inspired" insofar as it was designed by category theorists and leverages types, composition, program synthesis, etc., but it's not part of the same research track.

Faez Shakil@f_aezs·27 Feb

@vr4300 Are you using any of the previous cat theory work for this?

George Morgan@vr4300·27 Feb

We have open sourced the research repo we used to solve the currently public ARC-3 tasks. We will continue to make improvements to this code ahead of rerunning it on all available tasks when the full challenge launches.

Agentica@agenticasdk

We've released the code used to solve all 3 publicly available ARC-AGI-3 games. github.com/symbolica-ai/A…

George Morgan@vr4300·27 Feb

@Donogzs @fchollet @agenticasdk Yes it does! You can even pass entire objects by reference no problem.

⭕@Donogzs·26 Feb

@vr4300 @fchollet @agenticasdk Does your agent scaffold allow the RLM to define env variables for the REPL? Like in such a way that the RLM or subRLM can reference in later turns? I imagine that would be useful
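The idea raised in this exchange, variables that persist across REPL turns and host objects shared by reference, can be illustrated in plain Python. This is only a rough sketch: `ReplSession` is a hypothetical name invented here, built on the standard `exec` function, and it is not the Agentica API.

```python
# Hypothetical stand-in for a REPL session; this is NOT the Agentica API.
class ReplSession:
    def __init__(self, **objects):
        # Objects are stored by reference, so code run in the session
        # mutates the caller's originals rather than copies.
        self.namespace = dict(objects)

    def run(self, source: str) -> None:
        # Every turn executes against the same namespace dict, so names
        # defined in one turn remain visible in later turns.
        exec(source, self.namespace)


board = {"level": 1, "moves": []}
session = ReplSession(board=board)

# Turn 1: define a variable inside the session.
session.run("best_move = 'up'")
# Turn 2: a later turn still sees it and mutates the shared object.
session.run("board['moves'].append(best_move)")

print(board["moves"])
```

Because the session holds the same `board` object as the caller, the change made in turn 2 is visible outside the session without any copying or serialization.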

Agentica@agenticasdk·23 Feb

We have now solved all publicly available ARC-AGI-3 puzzles.🧩

George Morgan@vr4300·24 Feb

@jake_researcher @agenticasdk Not quite, not all of the puzzles have been released yet. We'll be sure to try it once the rest are out and post our results!

Jake@jake_researcher·24 Feb

@agenticasdk Does this mean ARC-AGI is now saturated? What's the next benchmark that's resistant to overfitting?

George Morgan@vr4300·24 Feb

@danthaeon Feature noted!

danhelo ♱@danthaeon·24 Feb

@vr4300 yep great for well documented pip packages. But I mean for obscure large codebases with hard to parse functionalities. something like agentica agent clusters 3-4 of the functions into an agentic function, then builds an agent on top of those. It does the developers' job.

George Morgan@vr4300·24 Feb

🥹

danhelo ♱@danthaeon

@agenticasdk is the future and one of those "mathematical beauty" moments that are so rare for programming. the real abstraction layer for agentic engineering. it just makes sense.

George Morgan@vr4300·24 Feb

@danthaeon You can currently plug any pip package into Agentica and it just works! The agent will read the code graph and figure out what to do. Is that what you mean?

danhelo ♱@danthaeon·24 Feb

@vr4300 guys please run agentica agents that build agentica optimized tooling for common SDKs maybe even make it an abstraction so it's "plug and play" for any SDK you bring. don't know if the model's "taste" is there yet but it shouldn't be too hard to do

George Morgan@vr4300·24 Feb

1. Yes, it sees each game only once.
2. It is not yet human level. We observe that it performs fairly close to the human baseline on almost all of the levels, but it seems to get stuck on a few of them (as shown in this video) before eventually recovering. This totally blows the action budget.

Our scores are below. We haven't yet optimized the harness. I am confident that we can drastically improve the performance before the rest of the puzzles come out. We will be sure to rerun it and release our official final scores publicly when they do!

• ft09, 344 actions, 39.15%
• vc33, 2092 actions, 42.87% – L5: 1604 actions vs 92 baseline
• ls20, 3703 actions, 69.77% – L7: 3240 actions vs 82 baseline
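To make the "blows the action budget" remark concrete, the per-level figures quoted above can be turned into ratios. The short sketch below uses only numbers from the post. The two derived quantities, overshoot versus the baseline and the share of a game's actions spent on the stuck level, are illustrative and are not the official ARC-AGI-3 scoring formula.

```python
# Figures copied from the post: (label, agent actions on the stuck level,
# human-baseline actions on that level, agent's total actions for the game).
stuck_levels = [
    ("vc33 L5", 1604, 92, 2092),
    ("ls20 L7", 3240, 82, 3703),
]

overshoot = {}    # how many times more actions than the baseline
share_pct = {}    # percent of the game's total actions spent on that level
for label, agent_actions, baseline_actions, game_total in stuck_levels:
    overshoot[label] = round(agent_actions / baseline_actions, 1)
    share_pct[label] = round(100 * agent_actions / game_total, 1)

for label in overshoot:
    print(f"{label}: {overshoot[label]}x baseline, "
          f"{share_pct[label]}% of the game's actions")
```

On these numbers, a single level accounts for roughly three quarters or more of each game's total actions, which fits the description of the agent getting stuck before eventually recovering.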

George Morgan@vr4300·24 Feb

@JustinWaugh @agenticasdk x.com/vr4300/status/…

George Morgan@vr4300

Justin Waugh@JustinWaugh·23 Feb

@agenticasdk Congrats! One of my favorite plots was the Level (y axis) vs. Turns (x axis) plot of humans that they released early. How does your system compare on those? (From the video above, it looks like a lot of "random walk".) I'm also curious about total cost and total time (in wall clock time).

François Chollet@fchollet·24 Feb

@agenticasdk 1. Is it seeing each game only once? (it is of course possible to brute-force any game given infinite trials, but that is not the goal here) 2. Is it using a number of actions per game comparable to what humans need? (upon seeing the game for the first time)

George Morgan@vr4300·23 Feb

Huge congrats to the @symbolica research team. It's unbelievable how fast they were able to turn around ARC-3 after getting SOTA on ARC-2! Can't wait to try this on the rest of the puzzles when they come out.

Agentica@agenticasdk

We have now solved all publicly available ARC-AGI-3 puzzles.🧩

George Morgan@vr4300·23 Feb

@chrysb @openclaw Correct. You should try @agenticasdk.

Chrys Bader@chrysb·23 Feb

unpopular (maybe?) opinion: MCP is dead in the water

@openclaw has shown me that api & cli will win.

every MCP server you connect loads its tool definitions into your context window. name, description, parameter schema, all of it. connect 10 servers with 5 tools each and you've burned 50 tool definitions worth of tokens before your conversation even starts.

context bloat will never be a good thing - performance-wise or economically. i assume this is why @steipete left it out of @openclaw.

the "exec" tool paired with on-demand skills is all you need. it can run any command invented since the beginning of computers. a resurgence of glory for ancient, but powerful tools like curl, sed, awk, grep. command line tools once mastered by the greats, but long forgotten and buried underneath abstractions developed for us lesser mortals. now available to us all, piloted by the smartest models on earth. every founder gets their own mass army of greybeards.

the inertia required for MCP adoption, imo, is too great to overcome the momentum @openclaw has breathed into api + cli + skills.

the common defenses people bring up:
• "MCP gives you typed schemas and validation" - so does a well-documented CLI
• "MCP gives you explicit permissions" - so does a sandbox with an allowlist
• "MCP is a standard" - a standard that scales poorly is still a standard that scales poorly

lastly, i've heard many MCP servers are just wrapping existing APIs - that kind of redundancy and unnecessary indirection should be a red flag.

so, let's drop it and redirect our efforts into cli tools & apis with accompanying skills.
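The context-bloat arithmetic in the post above (10 servers with 5 tools each gives 50 preloaded definitions) can be written as a back-of-envelope calculation. The post does not say how many tokens a definition costs, so `TOKENS_PER_DEFINITION` below is a purely hypothetical placeholder to be replaced with a measured value.

```python
# Hypothetical average size of one tool definition (name, description,
# parameter schema). The post gives no figure; this is a placeholder.
TOKENS_PER_DEFINITION = 150


def preloaded_definitions(servers: int, tools_per_server: int) -> int:
    """Number of tool definitions loaded before the conversation starts."""
    return servers * tools_per_server


def upfront_tokens(definitions: int, tokens_per_definition: int) -> int:
    """Context-window tokens consumed by those definitions."""
    return definitions * tokens_per_definition


definitions = preloaded_definitions(servers=10, tools_per_server=5)
overhead = upfront_tokens(definitions, TOKENS_PER_DEFINITION)
print(definitions, overhead)
```

Under the assumed 150 tokens per definition, the 50 definitions would occupy 7,500 tokens before the first user message. The real figure depends entirely on how verbose each server's schemas are.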

George Morgan@vr4300·22 Feb

@gooby_esq @agenticasdk, a kind of RLM, can be used to solve LS20 quite easily. x.com/agenticasdk/st…

Agentica@agenticasdk

👀

gooby@gooby_esq·22 Feb

Made it to level 3 on LS20 in 203 steps (ARC AGI 3) with a DSPy RLM based multi agent system (agent is still working as we speak). Making good progress on this tbh.

I've got a main RLM agent that solves the game board and three reflection agents that reflect on the game state and past history after every move or every batched set of moves. None of the original instructions are specific to LS20; they are general to the ARC-AGI-3 format.

• One reflection agent inspects the game visually (normal dspy predict module with image input).
• One comments and creates a knowledge base for game mechanics and game board items (another RLM that gets the full game state history and other metadata about the turns).
• One comments and creates a knowledge base specifically on the REPL history and suggests efficiency and improvements for how to interact in the RLM REPL for this specific game (basically it's building up a sort of skill.md for this specific game for how to effectively use the REPL).

I made numpy and pandas available in the REPL as well. Basically what I've built is a custom dspy optimizer that rewrites the solver's instructions based on what we've learned about the game in between each batched set of moves.

I'm trying to set up the custom live viewer I made so I can make it public and you can tune in and watch the runs. Using gemini 3.1 under the hood for everything. (I have gemini credits). If @OpenAI or @AnthropicAI want to throw me some credits I'll try those models too 😅
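The control flow described in this post (a solver plays a batch of moves, three reflectors comment on the run, and the solver's instructions are rewritten between batches) can be sketched in plain Python. Everything below is a hypothetical stand-in written for illustration: the function and class names are invented, the reflectors return canned strings instead of calling a model, and no DSPy APIs are used.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RunState:
    base_instructions: str                      # game-agnostic starting prompt
    instructions: str = ""                      # prompt the solver currently sees
    history: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)


Reflector = Callable[[RunState], str]


def visual_reflector(state: RunState) -> str:
    return f"visual: reviewed board after {len(state.history)} moves"


def mechanics_reflector(state: RunState) -> str:
    return f"mechanics: last move was {state.history[-1]}"


def repl_reflector(state: RunState) -> str:
    return f"repl: {len(state.notes)} earlier notes available"


def solve_batch(state: RunState, batch_size: int) -> List[str]:
    # Placeholder for the main solver; a real one would read state.instructions.
    start = len(state.history)
    return [f"move-{start + i}" for i in range(batch_size)]


def run(state: RunState, reflectors: List[Reflector],
        batches: int, batch_size: int) -> RunState:
    state.instructions = state.base_instructions
    for _ in range(batches):
        state.history.extend(solve_batch(state, batch_size))
        state.notes.extend([reflect(state) for reflect in reflectors])
        # Rewrite the solver's instructions from the accumulated notes.
        state.instructions = "\n".join([state.base_instructions, *state.notes])
    return state


final = run(RunState("Solve the game."),
            [visual_reflector, mechanics_reflector, repl_reflector],
            batches=2, batch_size=3)
print(len(final.history), len(final.notes))
```

The sketch keeps the base instructions separate from the accumulated notes, so each rewrite starts from the same generic prompt and only the learned notes grow between batches.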

George Morgan@vr4300·21 Feb

@BlinkDL_AI Impressive!

BlinkDL@BlinkDL_AI·21 Feb

Neurosymbolic LM works: RWKV-8 ROSA-4bit 🌹 demo. L12-D768 trained on minipile (1.5B tokens). Stable and fast training. Ready to scale whenever I finalize the design 🙂 github.com/BlinkDL/RWKV-L…

BlinkDL@BlinkDL_AI

RWKV-8 ROSA-QKV-1bit 🌹 Visualizer: huggingface.co/spaces/Jellyfi…

George Morgan@vr4300·20 Feb

@PeterGazdik @VictorTaelin x.com/vr4300/status/…

George Morgan@vr4300

@VictorTaelin Everyone posting about their 4 Mac Minis talking about how they are "full time employees".

Peter G@PeterGazdik·20 Feb

@vr4300 @VictorTaelin So... What is that thing?

George Morgan@vr4300·20 Feb

.@VictorTaelin You have the opportunity to do the funniest thing.

Taelin@VictorTaelin

for some reason this image keeps going viral, the good news is that it is in a *much* better shape right now, it will (probably?) not catch fire, and it will be serving something really cool for you really soon™ (no it is not for LLMs; matmuls are banned from my cluster)

George Morgan@vr4300·20 Feb

@irl_danB @agenticasdk It's the same harness code built with @agenticasdk as our ARC-AGI-2 solution but this time with tools to interact with the ARC-3 environment.

dan@irl_danB·20 Feb

@agenticasdk @vr4300 is this the same RLM with slightly different instructions? nice work I haven’t cracked l2 Level 3 yet

Agentica@agenticasdk·20 Feb

ARC-AGI-3 ft09 solved in 346 steps
