Dan Grafham

38 posts

@dangrafham

Building the AI dark factory. Field notes ↓

Joined May 2026
33 Following, 12 Followers
Pinned Tweet
Dan Grafham
Dan Grafham@dangrafham·
I could use a hand. I've built a governed autonomous software factory from scratch in Go. It is live on Hetzner across 8 private repos, with specialist agents, supervisors, isolated worktrees, lineage, cost tracking, replay, and doctrine boundaries. I'm now pushing on the 100 to 1000 agent problem: how one human operates a software factory at that scale without losing governance, cost control, or authority. I'm personally funding it, and the whole system is still running for less than $1,000/month including infrastructure. But API limits and personal runway are becoming the bottleneck. I'm looking for enough AI-compute runway to keep pushing this properly: credits, sponsored access, subscriptions, a small bridge, or an intro to somebody who can help. Happy to show the cockpit live. DMs open. dangrafham.com
Dan Grafham tweet media
2
1
1
174
Dan Grafham
Dan Grafham@dangrafham·
Tests are tools. Evals are contracts. Reality is the final approver. The moment you go autonomous, every verification ritual you inherited has to earn its place again. Most of it is just instrumentation. Deploy is not proof. dangrafham.com/real-world-mus…
0
0
0
33
Dan Grafham
Dan Grafham@dangrafham·
@JustJake The loops are the easy part to explain. The hard part starts when the loops spawn more loops. Then the question is how does one person maintain lineage, cost control, authority, and situational awareness without drowning in the details? That's the layer I've been building.
Dan Grafham tweet media
0
0
1
300
Jake
Jake@JustJake·
The problem with this, and why I think people are frustrated: Nobody has taught folks how to do this. It feels both evidently the future and also somehow gatekept.
98
17
488
311.3K
Dan Grafham
Dan Grafham@dangrafham·
Everyone is racing to tear the cage off their agents. Right call. But two things are wearing the same costume. A cage distrusts the model's thinking (tests, validators, retry loops you wrote because you didn't trust it). Rip those out. A constitution bounds its authority over what you can't undo: a landing gate nobody can bypass, or a receipt for every irreversible action. Those you keep, and they matter more as the agent gets smarter, not less. Tear out the cage. Keep the constitution. Telling them apart is the whole game. dangrafham.com/cages-and-cons…
0
0
0
65
Dan Grafham
Dan Grafham@dangrafham·
tmux is absolutely enough for six agents. The interesting part starts when "how many can I hold in my head at once?" becomes the bottleneck. At 100–1000 agents, orchestration is not opening more panes. It is lineage, budgets, authority, containment, replay, and deciding which three things need a human right now. Simple is good. But simple also has a scale limit.
1
0
9
2.1K
Sudo su
Sudo su@sudoingX·
if you're just getting into agentic workflows, the orchestration part is way simpler than the people selling you platforms and courses want you to believe. here's the entire thing, and once you see it you can't unsee how simple it is. you open an agent in every tmux pane you want working for you. you give each one sandboxed skip-permission so it can actually move without you approving every keystroke. then you spin up one more agent as the main one and you tell it plainly, these other panes are all working with us, your job is to look at them and delegate. that's the whole orchestration. no framework. no dashboard. no per-seat subscription. the only real limit is how many of them you can hold in your head at once. i run six. each of those six has its own subagents under it, all working on my actual product right now while i type this. it sounds insane until you watch it run, then it just looks obvious. they overcomplicate this on purpose, because simple is hard to put a price tag on. you don't need their course or their dashboard. you need a terminal and the nerve to let the agents work.
40
22
506
32.5K
Dan Grafham
Dan Grafham@dangrafham·
"Stateful" is hiding three different things:
1. Persistent Workspace: The filesystem survives long enough to resume work later. Processes may die, in-memory state may vanish. The environment may be reconstructed from a preserved workspace, a persistent drive, or restored filesystem after teardown.
2. Snapshot-able Sandbox: The environment can be paused, snapshotted, restored, forked, or resumed. Create sandbox, snapshot, restore later.
3. Durable Computer: The sandbox has a stable identity. It sleeps, wakes, accumulates tools and state, and remains addressable across conversations. Persistence is not a trick, it is the core model.
0
0
1
32
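The three meanings of "stateful" in the post above can be read as capability tiers. The Go sketch below is illustrative only: `Capabilities`, `Classify`, and the idea that the tiers are strictly ordered are assumptions made for the example, not any sandbox provider's API.

```go
package main

import "fmt"

// StatefulKind names the three meanings of "stateful" from the post,
// plus an Ephemeral baseline (an assumption added for completeness).
type StatefulKind int

const (
	Ephemeral           StatefulKind = iota // nothing survives teardown
	PersistentWorkspace                     // filesystem survives; processes and memory may not
	SnapshotableSandbox                     // environment can be snapshotted, restored, forked
	DurableComputer                         // stable identity, addressable across conversations
)

func (k StatefulKind) String() string {
	switch k {
	case PersistentWorkspace:
		return "persistent workspace"
	case SnapshotableSandbox:
		return "snapshot-able sandbox"
	case DurableComputer:
		return "durable computer"
	default:
		return "ephemeral"
	}
}

// Capabilities is a hypothetical description of what a sandbox offers.
type Capabilities struct {
	FilesystemSurvives bool // files can be recovered after teardown
	CanSnapshot        bool // pause, snapshot, restore, or fork is supported
	StableIdentity     bool // the same sandbox can be addressed again later
}

// Classify returns the strongest tier the capabilities support.
func Classify(c Capabilities) StatefulKind {
	switch {
	case c.StableIdentity && c.FilesystemSurvives:
		return DurableComputer
	case c.CanSnapshot:
		return SnapshotableSandbox
	case c.FilesystemSurvives:
		return PersistentWorkspace
	default:
		return Ephemeral
	}
}

func main() {
	fmt.Println(Classify(Capabilities{FilesystemSurvives: true}))
	fmt.Println(Classify(Capabilities{FilesystemSurvives: true, CanSnapshot: true}))
	fmt.Println(Classify(Capabilities{FilesystemSurvives: true, StableIdentity: true}))
}
```

Asking which tier a provider actually reaches is more useful than asking whether it is "stateful" at all.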
Tereza Tizkova
Tereza Tizkova@tereza_tizkova·
@gm_mertd how do you define stateful? im seeing it as persistent, just being able to go back to where it was, resume, etc
1
0
1
106
Tereza Tizkova
Tereza Tizkova@tereza_tizkova·
made overview of sandbox providers, but only from official sources (docs, web) github.com/tizkovatereza/… There is a lot of slop and 100% false claims circulating about sandboxes today. I dont like slop. this took me 5 mins and three comments to my Droid, and is open to PRs if you have more accurate info. goal is truth.
Tereza Tizkova tweet media
19
10
94
30K
Anthropic
Anthropic@AnthropicAI·
Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor. It’s happening faster than we thought, and the implications deserve greater attention. anthropic.com/institute/recu…
1.8K
4.7K
28.6K
18.5M
Dan Grafham
Dan Grafham@dangrafham·
@mattpocockuk COMPACTION = CORRUPTION. Sometimes necessary, never free. The failure mode is when the secondary source quietly becomes the authority and nobody remembers what was discarded.
0
0
0
30
Matt Pocock
Matt Pocock@mattpocockuk·
A context engineering metaphor I've been playing around with:
- Primary source: the source of truth. Raw data. Transcripts. Code.
- Secondary source: one step removed. Summaries. Compactions. Documentation.
For instance, compaction takes a primary source (the conversation history) and turns it into a secondary source (the summary). This is lossy, but means the secondary source can fit into a smaller space. If you want to know what your codebase does, your code is a primary source. Your docs are a secondary source. Loading primary sources into context is expensive, but provides richer context. Secondary sources are cheaper to load into context, but may be information-lossy. Any context engineering will involve managing the tradeoffs between both.
46
20
473
30.6K
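The exchange above argues that compaction is lossy and that the danger is forgetting what was discarded. One way to guard against that is to make every compaction carry a receipt for the primary source it replaced. The Go sketch below is a minimal illustration under stated assumptions: `Compaction` and `Compact` are hypothetical names, and plain truncation stands in for a real summarizer.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// Compaction is a secondary source that records what it replaced.
type Compaction struct {
	Kept        []string // most recent messages, kept verbatim
	Dropped     int      // number of primary messages discarded
	PrimaryHash string   // SHA-256 of the full primary history, for later audit
}

// Compact keeps the last `keep` messages and records how much was dropped.
// Truncation is a stand-in here; a real system would summarize instead.
func Compact(history []string, keep int) Compaction {
	if keep < 0 {
		keep = 0
	}
	if keep > len(history) {
		keep = len(history)
	}
	sum := sha256.Sum256([]byte(strings.Join(history, "\n")))
	kept := append([]string(nil), history[len(history)-keep:]...)
	return Compaction{
		Kept:        kept,
		Dropped:     len(history) - keep,
		PrimaryHash: hex.EncodeToString(sum[:]),
	}
}

// Lossy reports whether anything was discarded from the primary source.
func (c Compaction) Lossy() bool { return c.Dropped > 0 }

func main() {
	history := []string{"a", "b", "c", "d"}
	c := Compact(history, 2)
	fmt.Println(c.Kept, c.Dropped, c.Lossy())
}
```

Because the hash identifies the exact primary history, a reader of the summary can tell that something was dropped and check whether an archived transcript is the one it came from.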
DegenApeDev
DegenApeDev@DegenApeDev·
Are you Canadian and into AI? Let's see what you're building here 👇 Builder Share Thread 🧵
68
4
90
13.3K
Dan Grafham
Dan Grafham@dangrafham·
The red boundary breach is not decoration. It is the system catching a rule violation in real time, isolating the affected worktree, and surfacing operator controls to acknowledge the breach or retry in a fresh sandbox. The goal is not maximum autonomy at any cost. It is governed autonomy that remains inspectable, constrained, and operable as the factory scales. It's factories all the way down.
Dan Grafham tweet media
0
0
1
48
Dan Grafham
Dan Grafham@dangrafham·
The interesting problem is no longer "can agents write code?" It is: How does one human operate hundreds of concurrent software agents without losing lineage, governance, cost control, or authority? That is why the runtime has supervisors for decision, judgment, and autonomic control. Not just more agents blindly spawning more agents.
Dan Grafham tweet media
1
0
1
63
Dan Grafham
Dan Grafham@dangrafham·
6/ What I still own: I write the doctrine the agents run under, I decide what gets built, and I firefight the failures the loop cannot route around yet. What landed today: off-registry sovereignty and the consume cutover, admission-gate hardening with quota and spend-velocity stops, canonical-repo unification with CI enforcement, chain-quality enforcement, fleet-wide required checks emitting concrete pass or fail, an epic-tier decomposer. Field notes: dangrafham.com Terms: dangrafham.com/glossary
0
0
1
43
Dan Grafham
Dan Grafham@dangrafham·
5/ I am none of those identities. I cannot push a merge through by hand, because admin enforcement applies to me too. A salvage path exists and is tracked, so I will not claim nothing ever goes around it. But the gate is real, independent, and the operator cannot bypass it.
1
0
1
51
Dan Grafham
Dan Grafham@dangrafham·
1/ There is an argument going around this week about whether capable agents still need to be wrapped in control code, or whether the wrapper is just friction we should drop. I have been running one inside a real, enforced gate. Here is day 2.
1
0
1
105
Leonie
Leonie@helloiamleonie·
just curious: what’s the most useful thing your OpenClaw, Hermes Agent, etc. is doing for you?
366
65
1.4K
323.7K