sam galanakis



Blog: The Agent Is a Workflow That Writes Itself
How we productionized the RLM with durable execution. Subagents lower to child workflows, and PTC runs through a deterministic workflow-space interpreter. Every tool, subagent, and PTC call goes through a single recursive dispatch loop. Closure under replay, retry, cancel. We call these durable agents.
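To make the "single recursive dispatch loop" idea concrete, here is a minimal sketch in plain Python. It is not the blog's implementation: the `Call` type, the `TOOLS` registry, and the `dispatch` function are all hypothetical names invented for illustration, and the in-memory `log` only stands in for the event history that a durable execution engine would record and replay. It assumes PTC refers to programmatic tool calling, which the post does not spell out.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Call:
    """One unit of agent work (hypothetical shape, for illustration only)."""
    kind: str                                   # "tool", "subagent", or "ptc"
    name: str = ""
    args: dict = field(default_factory=dict)    # keyword arguments for a tool
    steps: list = field(default_factory=list)   # nested calls for subagent / ptc


# Toy tool registry; a real system would register actual tools here.
TOOLS: dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}


def dispatch(call: Call, log: list, depth: int = 0) -> Any:
    """Route every call kind through the same function, recursing for nesting.

    Each call is appended to `log` before it runs, so the log is a
    deterministic, ordered record of everything that was dispatched.
    """
    log.append((depth, call.kind, call.name))
    if call.kind == "tool":
        return TOOLS[call.name](**call.args)
    if call.kind in ("subagent", "ptc"):
        # Assumption: a subagent or PTC script is modeled as an ordered
        # list of further calls, each dispatched by this same function.
        return [dispatch(step, log, depth + 1) for step in call.steps]
    raise ValueError(f"unknown call kind: {call.kind}")
```

Because nested work re-enters the same function, anything wrapped around `dispatch` (logging here; replay, retry, or cancel in a durable engine) applies uniformly at every depth, which is one way to read "closure under replay, retry, cancel".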














The attention on LongCoT is great! It's far from solved (GPT 5.2 w/out tools gets 9.8%). Out-of-the-box, a GPT 5.2 RLM gets 25% (see Figure 7). Better prompting/training should push RLMs past this. Comparing RLMs to no-tool baselines? See our 🧵 of tips: x.com/sumeetrm/statu…


sorry it took me ~50 hrs! now i've got DSPy.RLM as SOTA on LongCoT (Full) by a very large margin, using... ...drumroll... Qwen 3.5 9B!
👑 Qwen3.5-9B + dspy.RLM = 15.69% on LongCoT-full
🔥 ~1.6× GPT 5.2's 9.83% on the same slice!






funny story, I've been trying to figure out the right shape for btca local for a while now

if u haven't seen it, it's a cli app that clones git repos u pass in, then lets an agent search them. super super useful for getting better code out of agents

what if it was a skill? why do I have to write code for:
- cloning a repo
- starting an agent
- tools for the agent

I already have a really good coding agent, just let it do all of that for me. It can clone the repo and do the search, and even contort itself into feeling like an app simply by telling it what it should be doing at different times

Like if u invoke the skill with a "/" command and no args, it outputs what I would have had a custom tui write. Except I didn't write code, I just told it what it's supposed to say if that happens

I cannot believe gstack is what made this click for me but it is

If u want to try the new version, it's so much better: npx skills add github.com/davis7dotsh/be… --skill btca-local

Lots of people interested in the Late Interaction workshop, listening to @lateinteraction's keynote!





