Mark Hendrickson

5.6K posts

@markymark

Your agents forget. I'm building the sovereign, deterministic state layer @ https://t.co/GqZHnbfSq4. Previously @LeatherBTC, @HiroSystems, @TechCrunch, @Crunchbase

Barcelona, Spain · Joined November 2007
3.7K Following · 6.3K Followers
Pinned Tweet
Mark Hendrickson@markymark·
I've iterated further on the Neotoma homepage based on consistent feedback: the old site was too focused on architecture and not enough on my target market, its pain points, or its use cases.

Lots of us are cobbling together agentic operating systems for our personal and professional lives, but on top of flat files, markdown docs, and JSON dumps that break between sessions.

If your agents run autonomously across sessions and tools (Claude, Cursor, ChatGPT, OpenClaw, custom scripts and cron jobs), the new homepage leads with what breaks and how Neotoma fixes it across the agentic lifecycle (operating, building, and debugging). ⤵️
Mark Hendrickson@markymark·
Everyone building multi-agent systems calls the shared substrate "memory." That framing is accurate as far as it goes. Memory is storage and retrieval: the system records what happened, and agents query when they need context.

But memory is passive. It holds truth. It does not transmit awareness of changes in truth to the parts of the system that need to react. Agent A writes a new observation and Agent B does not know until it polls. The data is there. The awareness is not.

A nervous system adds the transmission layer. After every write, the substrate emits a structured event describing what changed. Consumers subscribe and decide what to do. The substrate fires and forgets.

The constraint is the feature. A state layer that signals can drift toward becoming an orchestrator with filtering, prioritizing, retrying, and routing. Each step sounds reasonable in isolation. Together they turn the substrate into something that makes decisions about what matters, and you lose the property that made it useful: behavior fully determined by the write, not by policy.

Memory is judged by what it stores and whether you can query it back. A nervous system is judged by whether the rest of the system knows the moment truth changes. Those are different problems with different failure modes, and the second one is where multi-agent systems either scale or collapse. markmhendrickson.com/posts/from-mem…
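The write-then-emit pattern described here can be sketched minimally. This is an illustrative toy, not Neotoma's actual API: the names `Substrate`, `ChangeEvent`, `subscribe`, and `write` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ChangeEvent:
    """Structured description of a single write."""
    entity: str
    value: object

@dataclass
class Substrate:
    """State layer that emits an event after every write, then forgets.

    Deliberately no filtering, prioritizing, retrying, or routing:
    behavior stays fully determined by the write, never by policy.
    """
    state: dict = field(default_factory=dict)
    subscribers: list = field(default_factory=list)

    def subscribe(self, handler: Callable[[ChangeEvent], None]) -> None:
        self.subscribers.append(handler)

    def write(self, entity: str, value: object) -> None:
        self.state[entity] = value          # memory: record truth
        event = ChangeEvent(entity, value)  # nervous system: describe the change
        for handler in self.subscribers:    # fire and forget: no decisions made
            handler(event)

# Agent B learns of Agent A's write the moment it lands, without polling.
substrate = Substrate()
seen: list[str] = []
substrate.subscribe(lambda e: seen.append(e.entity))
substrate.write("observation:42", {"agent": "A", "status": "done"})
```

The point of the sketch is what's absent: the substrate never inspects the event or chooses a recipient, so all reaction logic lives in the subscribers.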
BrandonMarshall.btc@marshallmixing·
My daily ritual for over 5 years: open Discord, open X, open Telegram, open Reddit, repeat. Interacting with the community → best part. Filtering signal from the noise and making that feedback actionable → worst part. Today, I'm fixing the worst part. Meet Vibewatch 💚
Mark Hendrickson@markymark·
Hybrid is probably your team shape, and it's the hardest one to get right when AI absorbs execution.

Most teams (once one person can no longer handle the whole product alone) are not pure generalist or pure specialist. They're a staff engineer working deeply on infrastructure beside a surface PM doing market, design, and light architecture. Or a specialist designer working across the whole product beside generalist builders owning individual surfaces end-to-end.

This is where the clean diagram meets messy reality. The gap between two generalists or two deep specialists is smaller than the gap between a specialist and a generalist. A staff engineer's constraint is written in language that a broad surface owner doesn't naturally parse. A designer's systematic rationale gets flattened when compressed into the cross-functional builder's working vocabulary.

That is where silent constraint-drop gets even more expensive. If the translation layer drops one architectural commitment on the way to the surface owner, the owner often lacks enough depth to notice the omission. Conversely, the specialist often lacks enough surface context to see the downstream consequence. The work stays locally reasonable but drifts globally. That's why hybrid teams need artifact integrity more urgently than pure specialist or pure generalist teams.

The architecture also applies well beyond standard software teams, though with calibration. Hardware timelines, regulated systems, bridge code, payment rails, and clinical tools do not negate the model. They partition it. Below the stakes line, you run higher agent autonomy and lighter review. Above it, you run constrained autonomy and dense, infrastructure-backed review. Partitioned trust isn't a weakness of the model. It's what an honest, tailored deployment looks like.

Every staffing and tooling decision downstream of the inversion starts with three questions:

1. Is AI for the execution layer in your domain actually good enough today?
2. What team shape fits your product complexity: generalist, specialist, or hybrid?
3. Which surfaces carry a catastrophic blast radius and therefore need a tailored review posture?

Not team size. Not funding stage. Not industry orthodoxy. Not competitor mimicry.

The diagram is good. The transition is hard. If you had to draw your trust partitions today, where would the agent stop and the dense review begin?
Mark Hendrickson@markymark·
Translation isn't adjudication.

AI can summarize the architecture for the PM, the design system for the engineer, and the positioning for the designer. It can keep three excellent specialists fluent in each other's work without a single meeting. What it can't do is decide which one wins when the summary reveals all three are right inside their own domain and incoherent across them.

A new feature lands. Positioning says it should be approachable for non-technical users. The design system says dense, information-rich patterns appropriate for power users. The architecture says the natural implementation is a configuration file, which is neither. Nobody made an error in their own domain. The tension lives between the domains.

In the old org, this got worked out in a meeting. Someone arbitrated; the group iterated. Slow, but the resolution lived in the heads of the people in the room, not in any artifact. In the async parallel structure, the same tension can compound invisibly for weeks before anyone calls it a product problem.

The architecture requires a role most org charts don't name explicitly: a reconciler. Often a senior cross-functional lead, often the founder. Their job isn't to attend more meetings. It is to maintain the rubric.

The rubric is the explicit precedence order among competing disciplinary goods (e.g. approachable beats dense for this product, in these contexts, because of these commitments). It has to be specific enough to resolve the same trade-off the same way twice. It is not a values poster. It is not a strategy deck. It is the written version of trade-offs leaders have historically kept fuzzy on purpose.

That fuzziness is the actual blocker. Fuzzy commitments let executives mean different things to different audiences and defer hard choices indefinitely. Rubrics require choosing, being specific about the choice, and accepting that the document now governs rather than your judgment in the moment. Most companies won't do this even when they fully understand it.

The good news: nobody authors a rubric in a vacuum. It accumulates through adjudication. A single trade-off resolved is an expense. The same pattern recognized across three resolutions and codified is an investment that retires a class of future expenses. Constitution by case law.

What ruins the loop is broken write integrity. If specialists make exceptions that never get formally reconciled, if multiple people edit against the same document without provenance, if six months of careful calibration gets quietly corrupted by two weeks of uncoordinated edits – then the rubric decays faster than the team can maintain it, and the structure collapses back into meeting-based coordination.

What's a trade-off your team has resolved three different ways in three different cycles, because nobody codified it as precedent?
Mark Hendrickson@markymark·
Every async experiment eventually re-invents the standup. Not because people love meetings, but because the PM's positioning doc was written for PMs, the design system was written for designers, and architecture docs were written for engineers.

@jasonfried built an entire company philosophy around eliminating meetings. @shreyas has argued for years that the right meetings are the highest-leverage activity in product. They were both drawing on real evidence, yet both were working around the same constraint. None of these artifacts were built to be read across disciplines. The execution middle (e.g. spec reviews, design handoffs, cross-functional syncs) was where the translation happened. Live, in a room, expensive, and irreplaceable. You couldn't cut the meeting without cutting the translation.

That's why Fried's approach required careful cultural engineering to make work, and why most teams that tried "fewer meetings" as policy drifted into incoherence. The coupling was real. The artifacts didn't carry the information that the meetings did.

AI changes the coupling, not the aspiration. When AI sits between specialists as a translation layer, the engineer doesn't need to read the design system fluently; they ask their agent what it implies for the component they're building. The PM doesn't parse architecture docs; they ask theirs whether their proposed feature violates a commitment. The judgment stays with the specialist. The translation moves to a layer that doesn't require a calendar invite.

This has a failure mode. AI translation can silently drop constraints. When it summarizes architecture for the PM without a critical technical commitment that rules out their feature, the work stays deep but drifts out of alignment. Nobody notices for weeks. Faithful summaries and durable artifact integrity are the difference between coherence and invisible drift. Without them, the meeting comes back – or worse, the incoherence ships.

What can actually go away now: stand-ups, hand-offs, spec reviews, design reviews, cross-functional syncs, and most status meetings. Not because of a calendar policy, but because the coupling that justified them has dissolved.

What stays:
– Strategy meetings, less frequent and more prepared
– Trust formation between people who haven't built mutual models of each other's judgment
– Novel decisions that have no precedent to resolve against

That's probably 70-90% of current operational meeting volume, gone because the work moved. The teams still running heavy meeting schedules in 2027 won't be teams that skipped AI. They'll be teams that bolted AI onto the old org chart and left the coordination layer untouched.

The question isn't how many meetings you've canceled. It's whether the work still needs them.
Mark Hendrickson@markymark·
I gave @Plaid's CLI a spin. But I still can't use it without going through a multi-week business application with manual forms, no transparency, and a human-only process. Open finance that requires a business application isn't open. And it definitely isn't agent-ready.
Mark Hendrickson@markymark

This will be a critical tool for open finance if end users can access fully without going through the Plaid application process as "developers". Going to give it a spin myself soon.

Mark Hendrickson@markymark·
Agreed on the PMF distinction. On AI's impact, your recent writing describes headcount flattening even as revenue scales, and engineers shifting from craft to orchestration. I think those are symptoms of the same structural shift I'm writing about here. When execution is picked up by AI, the human bottleneck becomes judgment attention on AI's inputs and outputs (not backlog anymore). That changes what the hiring trigger actually is. Not "too much work" but "one person can no longer sustain quality across foundations, review, and strategic calls."
Elad Gil@eladgil·
I don't think there was a debate on hire fast vs hire slow – I think the issue was that companies without product/market fit were encouraged to hire (when they shouldn't) & companies with product/market fit were told not to hire when they should. The company Sam runs now has >1,000 people. Separate from that, AI & agents do shift hiring and how to think about who to hire, when, and why.
Mark Hendrickson@markymark·
"Hire slow" versus "hire ahead of the curve." Startup hiring advice has run in two directions for over a decade. @sama's YC Playbook opens with "my first piece of advice about hiring is don't do it." @eladgil's scaling playbook pushes the opposite: hire ahead, because under-staffing compounds.

Both assume hiring is a response to execution demand. The question was only about timing. That assumption was correct when execution was expensive. One more engineer roughly doubled throughput. Coordination overhead was a tax on the gains, not an elimination of them. The "slow-hiring" camp said the tax was bigger than you thought. The "ahead-of-the-curve" one said the tax was smaller than the cost of under-staffing.

When execution collapses to AI, the underlying question changes. Adding a second engineer doesn't double output, because the engineer you have isn't bottlenecked on execution. They're bottlenecked on the human inputs to execution: foundational artifacts, architectural judgments, review of what AI produced, and strategic calls about what to build next. A second engineer doesn't parallelize those inputs. They introduce coordination cost on judgment calls one person was making unilaterally and fine.

The hiring trigger is no longer "too much execution work." It is: the attention budget of the current team has been exhausted on AI's inputs and AI's outputs. That is a specific moment. It is when the single human driving a product can no longer give adequate attention to three loads at once:

1. Authoring foundational artifacts with enough care that AI executes well.
2. Reviewing AI's output with enough density that quality holds.
3. Making the strategic calls that determine what gets built next.

When any one of those three starts getting neglected, you're at the ceiling. The neglect shows up before the throughput drops, which is why teams miss it. It looks like the founder skimming AI output instead of reviewing carefully. The PM reusing old interview notes instead of doing fresh research. The engineer letting architectural drift accumulate because writing the constraint doc properly would take a week they don't have.

None of these produce immediate failures. Features still ship. Users still use them. But compound quality starts degrading, and the degradation is invisible for months. Every hire before the attention ceiling is friction without leverage.

How is your team actually deciding when to add people – based on backlog or attention?
Mark Hendrickson@markymark·
Part II of The Human Inversion series is about why the slow-vs-fast hiring debate is asking the wrong question, what the attention ceiling looks like in practice, and the four objections this reframe needs to survive: markmhendrickson.com/posts/the-huma…
Mark Hendrickson@markymark·
@heyhve_ State is key. Not sure if you've seen my project (Neotoma) yet, but it's all about providing proper state for agentic memory. Code review is a critical use case, of course – happy to explore how a persistent state layer could serve Tenki, if helpful
hve 🍁@heyhve_·
@markymark Same instinct, I'm building Tenki against this exact bet. Premise is that review breaks down because it's stateless. Early days but the directional read feels right.
Mark Hendrickson@markymark·
Software teams organized around execution for decades. PMs wrote specs, designers made mockups, engineers shipped code. Pre-execution foundational work and post-execution review got whatever time was left over.

Then AI changed the economics. The "middle" is where models are genuinely good now. Execution cost collapsed. So humans move to the ends. Richer positioning, deeper architecture, and real design systems on one side; comprehensive review on the other. The ends were always where the most leverage lived; we just couldn't afford to staff them.

But removing the human middle also removes the implicit translation that kept disciplines in contact with each other's reality. Coherence has to come from somewhere else now.

I've written a 5-part series on what that all means, exploring:
– Attention scarcity
– Hiring triggers
– Async coordination
– Who and what adjudicates cross-domain conflict
– Data integrity at AI-generated volume
– Where the clean diagram breaks against real teams

All starting with why the shift is economically forced, and what it feels like for the specialists at the center of it.
Mark Hendrickson@markymark·
@heyhve_ Yea and review will continue to absorb human bandwidth now until we direct more of our attention to foundational materials so agents get things right the first time around. That foundational work should also make it possible for agents to review agent PRs more effectively, too
hve 🍁@heyhve_·
@markymark Right. And review is the new middle. Used to be the bottleneck before merge. With agents writing PRs, it's the bottleneck before the bottleneck.
Mark Hendrickson@markymark·
This will be a critical tool for open finance if end users can access fully without going through the Plaid application process as "developers". Going to give it a spin myself soon.
Plaid@Plaid

We built a CLI so you can do this:

plaid transactions list --json | claude -p "how much have I spent on eating out?"

Your real data, 4 commands away. No sandbox data, no SDK, up and running in minutes.

brew install plaid/plaid-cli/plaid

Read more here: medium.com/plaid-engineer…

Ethan Bloch@ebloch·
Plaid shipping a CLI feels small on the surface, but I think it unlocks a pretty important new pattern.

For a long time, the gap between "I want to understand my financial life exactly my way" and "I can actually build that system" was huge. You either used a consumer finance app with someone else's categories and workflows, or you tried to maintain a spreadsheet manually.

A Plaid CLI changes that. Now a small business owner, operator, or spreadsheet-heavy prosumer can connect their accounts, pull transaction data, and use tools like Codex and even @openclaw to turn that into a bespoke financial operating system. Not a generic PFM. Not another dashboard you have to adapt to. Something that matches how you actually think:
- your categories
- your cash flow model
- your business rules
- your household quirks
- your spreadsheet
- your questions

It's probably still a couple steps too technical for most consumers today. But for the kind of person already running their financial life through complex spreadsheets, this is a big unlock.
Mark Hendrickson@markymark·
Marty Cagan spent 2023 building the strongest case for PM value in a decade: four knowledge domains, deep stakeholder relationships, and integrative judgment no specialist can replicate. Then he spent 2025-2026 redefining the PM as a prototyper, which is the one skill AI just commoditized.

Two articles published in May 2025 established opposite cases. One says "the PM role becomes more essential," while the other says anyone with so-called "product sense" can fill it. They were published on the same site, only weeks apart.

The argument he could have made was already in his own archive: when building gets cheap, building the wrong thing becomes the dominant cost. Integration failures are cross-dimensional: compliance nightmares, broken unit economics, and building something that requires a sales motion your team doesn't have. No prototype surfaces them alone.

He had the standing and the audience to say that the coordinator PM is finally free, and now she becomes the integrator she was always supposed to be. Instead, the loudest message became "everyone is a builder now." That is true, but it leaves the harder question unanswered: who holds the full picture when everyone is building?