@thsottiaux It’d be great if the app had an option to automatically open each new conversation in its own Git worktree. I was surprised a few parallel chats were all writing straight to main.
I've been thinking about what actually changes when AI makes local iteration cheap.
Your implementation team can fix something in an afternoon now. But if it still has to become a ticket, you’re just running the old system faster.
The move is letting them ship it - and making it easy for everyone else to reuse and improve what works. That’s when capability starts compounding.
john-rush.com/posts/compound…
*Agent-Streams: A Skeptical Overseer for Long-Running Coding Agents*
Overnight coding agents are fantastic, until “DONE” means a TODO, disabled tests, or a stubbed integration.
What’s worked for me: treat DONE as a claim, not a fact, and add a skeptical overseer with fresh context.
The loop:
- Spec is the trust boundary
- Builder implements and declares DONE
- Overseer reviews spec + diff, runs checks, and can only output: ISSUES.md or APPROVED
- Merge only on APPROVED (otherwise iterate)
This pattern has dramatically reduced false-done merges for long-running agents. You can check out my implementation of the pattern here:
john-rush.com/posts/agent-st…
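A minimal sketch of the loop above, not the linked implementation. The names `Verdict`, `run_loop`, `toy_builder`, and `toy_overseer` are hypothetical, and the toy overseer only scans the diff for a few markers, where a real one would also run the checks against the spec:

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    """The overseer's only two outcomes: APPROVED, or a list of issues."""
    approved: bool
    issues: List[str] = field(default_factory=list)


def run_loop(
    spec: str,
    build: Callable[[str, List[str]], str],
    review: Callable[[str, str], Verdict],
    max_rounds: int = 5,
) -> Tuple[int, str]:
    """Iterate builder -> overseer until APPROVED; return (rounds, diff)."""
    issues: List[str] = []
    for round_no in range(1, max_rounds + 1):
        diff = build(spec, issues)    # builder returns a diff and claims DONE
        verdict = review(spec, diff)  # overseer is handed only spec + diff
        if verdict.approved:
            return round_no, diff     # merge happens only on this path
        issues = verdict.issues       # otherwise feed issues back and iterate
    raise RuntimeError(f"not approved after {max_rounds} rounds: {issues}")


# Toy stand-ins (assumptions for illustration, not real agents).
def toy_builder(spec: str, issues: List[str]) -> str:
    # First attempt leaves a TODO; once issues come back, it does the work.
    if issues:
        return "def add(a, b): return a + b"
    return "def add(a, b): pass  # TODO"


def toy_overseer(spec: str, diff: str) -> Verdict:
    # Treat DONE as a claim: look for signs the work was not actually done.
    markers = ("TODO", "NotImplementedError", "skip(")
    problems = [m for m in markers if m in diff]
    return Verdict(approved=not problems, issues=problems)


rounds, final_diff = run_loop("add two ints", toy_builder, toy_overseer)
print(rounds, final_diff)
```

The design choice that matters here is that `review` never receives the builder's conversation, only the spec and the diff, so it has no reason to trust the DONE claim.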
@iocapon Some automation, but I’m still figuring out when that’s useful.
Typically I’m using goose and Claude Code, writing Markdown files to pass between them.
I stay in the terminal, and with git I always have the history of those Markdown files.
i really love this approach and it's something i've been wanting to do for a few months now. i'm curious whether this is automated, whether you copy+paste between providers (go to ChatGPT, then Claude Code, etc.), or whether you've managed to automate it another way
Current workflow: multiple Claude Code tabs, each on its own git worktree.
o3 clarifies & writes the plan, sonnet-4 implements, o3 + sonnet audit.
If something’s off I adjust the plan/prompt - not the generated code.
Disposable outputs, compounding inputs.
Link in thread
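A rough sketch of the one-worktree-per-tab setup, assuming a sibling-directory layout and a `chat/` branch prefix (both are my illustration, as are the helper names `worktree_command` and `create_worktree`). It relies only on the standard `git worktree add -b <branch> <path> <commit-ish>` form:

```python
import re
import subprocess
from pathlib import Path
from typing import List


def worktree_command(repo: Path, title: str, base: str = "main") -> List[str]:
    """Build the git command that gives one task its own branch and worktree."""
    # Turn a free-form task title into a safe branch/directory slug.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    if not slug:
        raise ValueError("title must contain at least one letter or digit")
    # Assumed layout: worktrees live next to the main checkout.
    path = repo.parent / f"{repo.name}-{slug}"
    return [
        "git", "-C", str(repo),
        "worktree", "add",
        "-b", f"chat/{slug}",  # new branch, so nothing writes to main
        str(path),
        base,                  # branch point for the new worktree
    ]


def create_worktree(repo: Path, title: str, base: str = "main") -> None:
    """Run the command; raises CalledProcessError if git fails."""
    subprocess.run(worktree_command(repo, title, base), check=True)


# Only prints the command here; call create_worktree() inside a real repo.
print(" ".join(worktree_command(Path("/work/app"), "Fix login bug!")))
```

Each tab then starts its agent inside its own directory, so parallel sessions cannot overwrite each other's files.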
@felciano Yes, each macro unit of change. Basically a shippable unit of work I’d open a PR for.
Our current codebase is a large monorepo with 45M+ tokens, so I’m not able to regenerate it in one go yet.
Thanks for the write up. It sort of sounds like you rebuild the whole thing every time to find and “fix” a mistake by updating the specs (“inputs”). But presumably you don’t do that for the entire app/codebase, right? Do you do this cycle for each macro unit-of-change (e.g. new feature or enhancement)?