Stephen Martin

833 posts

Stephen Martin

@martintechlabs

We Build AI That Drives Outcomes https://t.co/jRO9yT3jd4

USA · Joined March 2009

511 Following · 1.5K Followers

Stephen Martin@martintechlabs·2d

@warrioraashuu Huh? Founders have outsourced coding forever...

aashuu ✦@warrioraashuu·2d

If AI wrote almost all your code, what actually makes you the founder?

Stephen Martin@martintechlabs·2d

@petergyang Isn't it great? Dropping out of corporate is so freeing.

Peter Yang@petergyang·2d

My calendar is pretty great now ngl. This is what PMs dream of.

Stephen Martin@martintechlabs·2d

@MatthewBerman 90%+ of knowledge work jobs are just loops. That's why managers have so many recurring meetings.

Matthew Berman@MatthewBerman·2d

loops but for all knowledge work

Stephen Martin@martintechlabs·2d

@inferencegod Gotcha. Makes sense now. Thanks for the details!

Ethereal@inferencegod·2d

yeah you’re right, subagents do get a genuinely fresh context, isolated tools, even worktree isolation and their own hooks. i was drawing the line in the wrong place. the distinction i actually mean is single-session vs separate sessions. a subagent, even a fork, is spawned and judged by the same parent agent in one run. mine are two independent claude processes with no shared parent deciding the verdict, they only see committed git state. the docs kind of point at this too, they send you to agent teams or background agents for cross-session stuff rather than subagents. honestly for a lot of setups subagents would do the job. i went heavier because i wanted the reviewer to be a process the builder cannot influence at all. thank you for the context!

Ethereal@inferencegod·2d

i don't feed my agent tasks anymore. when the backlog runs dry, it researches and invents the next feature itself, then builds it. and it polices its own work before i ever see it.
autonomy-loop v0.5.1:
→ self-feeding: empty backlog? it proposes the next feature and keeps going, no prompt from me
→ the bite: it reverts its own fix and reruns the test. stays green? it caught nothing, rejected
→ self-mutation: it mutates its own changed lines so weak tests get caught before handoff
→ circuit breaker: it parks to me instead of looping forever
→ branch protection: it can never touch prod or edit away its own gates
→ upgrading is one command: /autonomy-upgrade
→ red-teamed, 77 tests green
two terminals. a builder, and a reviewer that trusts nothing. one repo. nobody driving.
free, mit, 151 people already running it.
/plugin marketplace add github.com/inferencegod/a…
/plugin install autonomy-loop@autonomy-loop
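
The "circuit breaker" item above (park the task for a human instead of looping forever) can be sketched roughly as follows. This is an illustration only, not code from autonomy-loop: the function name `run_with_circuit_breaker`, the `max_failures` parameter, and the `"done"`/`"parked"` return values are all hypothetical.

```python
from typing import Callable


def run_with_circuit_breaker(
    attempt: Callable[[int], bool], max_failures: int = 3
) -> str:
    """Call `attempt` until it succeeds or has failed `max_failures` times.

    `attempt` receives the 1-based attempt number and returns True on success.
    Returns "done" on success, or "parked" once the failure budget is spent,
    which is the point where a human would be asked to step in.
    """
    for attempt_number in range(1, max_failures + 1):
        if attempt(attempt_number):
            return "done"
    return "parked"


def succeeds_on_second_try(attempt_number: int) -> bool:
    # Hypothetical task that fails once, then passes.
    return attempt_number == 2


def always_fails(attempt_number: int) -> bool:
    # Hypothetical task that never passes, so the breaker must trip.
    return False
```

The design point is that the retry budget is fixed up front, so a stuck task hands control back rather than burning time indefinitely.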

Stephen Martin@martintechlabs·2d

@inferencegod Oh, that is a different understanding than what I had. I thought the whole point of subagents was to have a fresh context. ref: code.claude.com/docs/en/sub-ag…

Ethereal@inferencegod·2d

subagents share the parent’s context and run inside the same session, so the reviewer is still kind of grading its own homework. two terminals are two independent claude processes that can’t see each other’s reasoning, only the committed git state. the reviewer re-runs the gate from scratch and reverts the builder’s fix to confirm the test catches it. the separation is the point. you can’t red-team yourself in the same context window. it also means a crash in one doesn’t take the other down, and the whole handoff is just git. hope this helps
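
The revert-and-rerun check described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the tool's actual implementation: it stands in plain Python callables for git commits, and the names `fix_is_proven`, `buggy_abs`, `fixed_abs`, `strong_test`, and `weak_test` are hypothetical.

```python
from typing import Callable

Impl = Callable[[int], int]
Test = Callable[[Impl], bool]


def fix_is_proven(test: Test, fixed: Impl, original: Impl) -> bool:
    """Accept a fix only if the test passes with it and fails without it.

    Running the test against the original (reverted) code is the key step:
    if the test still passes there, it never exercised the bug.
    """
    passes_with_fix = test(fixed)
    passes_after_revert = test(original)
    return passes_with_fix and not passes_after_revert


def buggy_abs(x: int) -> int:
    # Hypothetical pre-fix code: wrong for negative inputs.
    return x


def fixed_abs(x: int) -> int:
    # Hypothetical post-fix code.
    return -x if x < 0 else x


def strong_test(impl: Impl) -> bool:
    # Exercises the negative branch, so it fails on the buggy version.
    return impl(-3) == 3


def weak_test(impl: Impl) -> bool:
    # Only checks a positive input, so it passes even without the fix.
    return impl(2) == 2
```

Here `fix_is_proven(strong_test, fixed_abs, buggy_abs)` accepts the fix, while the same call with `weak_test` rejects it because the test stays green after the revert.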

Stephen Martin@martintechlabs·2d

@inferencegod love the idea, am just curious

Stephen Martin@martintechlabs·2d

@inferencegod Why not just use subagents?

Stephen Martin@martintechlabs·2d

@petergyang I find this can happen if you don't have a doc or a spec that it can use to check off and say something is done or explored. Are you using something like that?

Peter Yang@petergyang·2d

So I have Codex running on a /goal and it's been working for 2 hours, but the problem is it's making a lot of wrong assumptions, so I have to monitor and steer it constantly. Is this expected? Perhaps I should've had it make a detailed plan first?

Stephen Martin@martintechlabs·9 Jun

@mattpocockuk But who will process all of your queues? Loops, of course!

Matt Pocock@mattpocockuk·9 Jun

Everyone's banging on about loops
When they should be thinking about queues

Stephen Martin@martintechlabs·3 Jun

@mattpocockuk Just wait until they learn about XML!

Matt Pocock@mattpocockuk·2 Jun

Christ, I can't wait for the HTML > Markdown debate to die
Some people really just read the title of the article, eh

Stephen Martin@martintechlabs·27 May

@Leohuynh57 So much more to being a founder than code...

Leo Huynh@Leohuynh57·27 May

Hey founders: Do you ever feel shameless calling yourself a founder when AI wrote most of your code?

Stephen Martin@martintechlabs·26 May

I strongly agree that "AI-generated writing" will be more common. I almost treat it like an interface now. If I know that I'm speaking with a developer, then I'll create a doc that an AI can ingest and implement. If I know the document is going to be consumed by humans, I make sure it's thorough and tailored to their perspective. AI can tailor content to many people's perspectives, almost like translating into a different language.

Lenny Rachitsky@lennysan·25 May

My biggest takeaways from @danshipper:
1. The future of work will happen inside Codex or Claude Code. Instead of putting AI into your SaaS tool, you’ll use your SaaS tools inside your favorite AI agents' in-app browser. Dan spends all his time in Codex now: writing documents, managing email, doing research, everything. He's using Google Docs, PostHog, and everything he needs within the agent's in-app browser. The agent can see what he’s doing, and has all of his context, so he and his agent collaborate quickly and super effectively.
2. Automation is a lie: every automation needs a human. Dan's company doubled in size this year despite being incredibly AI-forward. Why? Because in order to make automation work well, you need humans making sure everything keeps working. This is why benchmarks are misleading: they measure AI on problems we’ve already framed and can score, but there’s always a higher frame.
3. PMs will win the AI era. Marcus, a former PM who previously ran Axios’s writing product, joined Every after getting super AI-pilled. Now he runs their product Spiral, and ships faster than anyone on the team. He pairs technical knowledge with spiky product sense, deep user empathy, and an eye for what matters. Dan thinks any PM who gets really AI-native will be incredibly dangerous because the building is done for you; what matters is figuring out what to build and if it’s great.
4. Full-stack designers are becoming superheroes. Designers used to make beautiful interactions that engineers didn’t want to build or couldn’t execute properly. Now designers don’t need to hand things off; they can build it themselves. Designers are naturally creative people, and AI is the perfect tool for them because it lets them bring their vision to life without the traditional bottlenecks.
5. SaaS is not dead. In fact, Dan is bullish on SaaS stocks. When users bring their own AI (via Codex or Claude Code) to use SaaS products, the user, not the SaaS company, pays for tokens. This saves SaaS companies’ margins. Since the agents need their own seats, Dan predicts that agents will create massive new demand for SaaS because there will be tons of agents using these products at high volume.
6. Every company will have one “super-agent” inside their Slack that every employee will use. Dan initially thought every employee would have their personal work agent, like a shadow AI org chart, but he’s completely flipped his view. He realized agents need humans who care about them. When someone gets tired of maintaining their personal agent, it becomes useless. The winning model is one forward-deployed engineer or AI-savvy person who maintains a company-wide agent (like Shopify’s River or Viktor), and then it trickles down to more specialized team agents as models improve and become less fiddly.
7. The AI job apocalypse is not happening, but you do need to evolve to stay relevant. Models make yesterday’s human competence cheap. But because everyone uses the same models, it all looks the same if you use it the default way; it becomes commoditized slop. Humans then take that frozen competence and use it to make something new and interesting for their specific situation. The key is to “ride the models”: use them for everything you do, try new models when they drop, keep turning over rocks.
8. We will read way more AI-generated writing, and we will like it. Human writing is incredibly important for things that matter, but for internal docs, planning, and email, AI-generated is often better because most people are bad at writing strategy documents.
9. Build software for humans and agents to use together. The current model is building a CLI that an agent uses independently. Instead, you and your agent should be using the app together. This creates new design challenges: agents can make a billion requests in three seconds, so you need approval flows, inboxes that summarize what happened, logs, and easy rollback.
10. Forward-deployed engineers are the new most essential role. The big model companies have teams of people managing their internal agents, and those teams aren’t going away. It’s different from traditional software building, and certain engineers love it. As models get better, this role will evolve: you’ll be managing more agents doing more things.

Lenny Rachitsky@lennysan

Automation is a lie. CLIs are over. The SaaSpocalypse is dumb.
A year ago @danshipper came on the podcast to predict where AI was heading. He was remarkably right, including the call that everyone was sleeping on Claude Code. Dan has a unique lens into where things are going because his team at @every is possibly the most AI-pilled group of people in tech. I always learn a ton talking to Dan. So I brought him back for round two.
We'll score these in exactly a year:
🔸 Every company will have one “super-agent” in Slack.
🔸 Codex and Claude Code will become the new operating system for knowledge work.
🔸 The AI job apocalypse is not happening.
🔸 PMs and designers will thrive.
🔸 We will read way more AI-generated writing and we will like it.
🔸 "I would buy SaaS stocks right now."
Listen now 👇 youtube.com/watch?v=4D3hDm…

Stephen Martin@martintechlabs·18 May

Built with gpt-realtime-2 this weekend and it felt like pair programming with a human. Instant responses, real-time iteration, no context switching. The gap between idea and execution just collapsed. This is the future of coding.

Stephen Martin@martintechlabs·16 May

Wow! The CR tool my agents needed.

Peter Steinberger 🦞@steipete

Try clawpatch.ai on one of your repos and let codex work its magic. It's amazing at uncovering bugs you didn't know you had.

Stephen Martin@martintechlabs·16 May

@sarahfim Have been following @composio for a while. Love the work you all are doing.

sarah@sarahfim·16 May

x.com/i/article/2053…

Stephen Martin@martintechlabs·14 May

@skylar_enns_ @KaiXCreator I hear you. That's why I use tools like @conductor_build and @emdashsh to give me a lot more versatility.

Skylar Enns@skylar_enns_·14 May

@martintechlabs @KaiXCreator True only from a model standpoint. As someone who uses t3code as the interface instead of the claude app/cli directly it's just a kick in the nuts.

Stephen Martin@martintechlabs·14 May

@1Umairshaikh It's about the process, not the tool.
- Break things down into small pieces
- Create a plan before coding
- Code-review with a different agent
- Make sure you have tests/validators
- Bonus: run weekly scans for general code clarity, test coverage, security, and architecture

Umair Shaikh@1Umairshaikh·14 May

Vibe coders, which one actually ships the best code for you? - claude - cursor - codex

Stephen Martin@martintechlabs·13 May

The workflow is the asset. Not the current model. Not the wrapper. Not the demo. If one vendor change breaks the whole thing, you built on the wrong layer. Portable workflows age better than model-specific hacks.
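
One way to read "portable workflow" in code: the workflow depends on a small interface, and each vendor sits behind it. The sketch below is an assumption-laden illustration. The `Model` protocol, the `summarize_ticket` workflow, and both stand-in model classes are hypothetical and do not call any real vendor API.

```python
from typing import Protocol


class Model(Protocol):
    """The only surface the workflow is allowed to depend on."""

    def complete(self, prompt: str) -> str: ...


class UpperModel:
    # Stand-in for vendor A: simply upper-cases the prompt.
    def complete(self, prompt: str) -> str:
        return prompt.upper()


class ReverseModel:
    # Stand-in for vendor B: simply reverses the prompt.
    def complete(self, prompt: str) -> str:
        return prompt[::-1]


def summarize_ticket(model: Model, ticket: str) -> str:
    """The workflow: build the prompt, call the model, tidy the result.

    Nothing here names a vendor, so swapping models is a one-argument change.
    """
    prompt = f"Summarize: {ticket}"
    return model.complete(prompt).strip()
```

Calling `summarize_ticket(UpperModel(), "login fails")` and `summarize_ticket(ReverseModel(), "login fails")` runs the same workflow against two different backends without touching the workflow itself.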

Stephen Martin@martintechlabs·12 May

Human review is not a failure mode in AI automation. It is the design.
Good first workflows:
- draft
- classify
- route
- surface the exception
Let the system do the repeatable part. Keep judgment with the team.
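
The classify, route, and surface-the-exception steps above might look something like this minimal sketch. It assumes a toy keyword classifier in place of a real model call, and the `Decision` type, the labels, the confidence values, and the `threshold` default are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    label: str
    confidence: float


def classify(text: str) -> Decision:
    """Toy keyword classifier standing in for a model call (assumption)."""
    lowered = text.lower()
    if "refund" in lowered:
        return Decision("billing", 0.9)
    if "password" in lowered:
        return Decision("account", 0.8)
    return Decision("unknown", 0.2)


def route(text: str, threshold: float = 0.7) -> str:
    """Route confident cases automatically; surface the rest to a person."""
    decision = classify(text)
    if decision.confidence < threshold:
        # The exception path: judgment stays with the team.
        return "human_review"
    return decision.label
```

With these toy values, `route("I want a refund")` goes to `billing`, while a message the classifier does not recognize lands in `human_review`. Raising `threshold` sends more cases to people, which is the knob a team would tune.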
