John Davenport

41 posts

@johns10d

Toronto, Ontario, Canada · Joined October 2016
49 Following · 6 Followers
John Davenport
John Davenport@johns10d·
@sam_hatoum I was playing with a complex generated feature: a Vega-Lite manual editor, an LLM chat that modifies the JSON document, and a rendering of the chart. I was trying to work at the BDD spec level to produce a working UI. Every iteration, the model shortcuts around making it fully work.
0
0
0
7
Sam Hatoum
Sam Hatoum@sam_hatoum·
Everyone fixates on "test" in Test-Driven Development. The important word is "driven." TDD was never a testing technique. It was a design technique — recording decisions as executable expectations. Post 8 of 20 in The Spec-Driven Shift series. specdriven.com/perspectives/t… Repost from @beonauto.
1
0
1
31
John Davenport
John Davenport@johns10d·
@automate_archit @anitakirkovska It’s not complicated. I sat with the agent to make a strategy. I do the strategy every day. I change it if I find something good. I add a tool if I need it. That’s it.
0
0
0
2
anita
anita@anitakirkovska·
Marketers with Claude Code who think like engineers are about to print money.
59
40
864
32.4K
Shades of Samsara
Shades of Samsara@ShadesofSamsara·
Yesterday I started using the remainder of my claude-code max5 subscription to create my own harness by copying most of what OpenClaw does and blending it with my own workflow dashboard I built for myself and my business. So far it's working by building it all on top of CLI -p commands: using my subscription with their tools, following their rules, to make my own wrapper that does what I want it to do. So far so good, just waiting on the current session to reset so I can keep debugging some of the feature issues from the fresh build.
1
0
1
240
Melvyn • Builder
Melvyn • Builder@melvynx·
Day 3 with OpenClaw: In all my tests, GPT 5.4 is consistently the worst model for agentic tasks. Lazy, stupid, never follows anything, feels like you are a baby sitter. I don't know how OpenAI manages to make such a shitty model but this feels terrible. I miss Opus.
187
8
392
39.6K
John Davenport
John Davenport@johns10d·
@WeberBuilds But Anthropic didn't build a harness. They built an environment for a harness. You still have to write your stop hook and what it's going to do when the stop hook fires. You still have to write your skills and progressive disclosure logic.
0
0
0
1
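The point in @johns10d's reply above, that you still have to decide what happens when the stop hook fires, can be sketched as a small decision function. This is a minimal sketch, not the author's harness: the payload field `stop_hook_active` and the `{"decision": "block", "reason": ...}` reply shape are assumptions based on a reading of Claude Code's hooks documentation and should be checked against it, and `tests_passed` is a hypothetical input that a real hook would compute by running a test suite.

```python
def stop_hook_decision(event: dict, tests_passed: bool) -> dict:
    """Decide whether the agent may stop (hypothetical helper).

    `event` is the hook payload. `stop_hook_active` is assumed to be True
    when the agent is already continuing because an earlier stop was blocked.
    """
    if tests_passed:
        # Empty reply: nothing to object to, so the agent may stop.
        return {}
    if event.get("stop_hook_active", False):
        # Already blocked once; let it stop rather than loop forever.
        return {}
    # Assumed reply shape: block the stop and tell the agent why.
    return {
        "decision": "block",
        "reason": "Tests are failing; fix them before stopping.",
    }
```

In a real hook script the payload would likely be read as JSON from standard input and the reply printed as JSON; that wiring is left out so the decision logic stays testable on its own.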
Michael Weber
Michael Weber@WeberBuilds·
Claude Managed Agents (probably) ended the "build your own harness vs framework" debate yesterday. Building and running your own custom multi-agent system is now basically "why am I doing this?" Scary times. What's the next layer to collapse?
1
0
0
18
John Davenport
John Davenport@johns10d·
@Trader_XO @trader1sz Specs are an effective way to plot and execute medium to long horizon development tasks, especially when you want to exert control over low-level engineering decisions. I've come to understand it's just part of the puzzle and may not always be applicable.
0
0
0
10
XO
XO@Trader_XO·
@trader1sz A solid win today: I get Claude Code to generate my specs and tasks. Upon completion it automatically launches a [codex exec] workflow to spin up Codex agents for the tests and implementation...
24
1
105
14.8K
Ken
Ken@ks458008·
@johns10d @everythingLLM Exactly my point. Hooks, test gates, verification loops — none of that is "just prompting." When the discipline requires procedural thinking, we need a new name for it.
1
0
1
14
Ken
Ken@ks458008·
Prompt Engineering is over. The next era is Harness Engineering.
1
0
1
42
John Davenport
John Davenport@johns10d·
@NotRiteQuite @virtualunc @romxdev ? @NotRiteQuite I'm a software engineer writing a procedural harness. I don’t know if your mother ever taught you “if you don’t have anything nice to say, don’t say anything at all.” If you’re going to talk shit on the internet, know who you’re talking to.
0
0
1
17
Not Right
Not Right@NotRiteQuite·
@johns10d @virtualunc @romxdev That's a waste of context, you should just learn to write code and use code to constrain. I don't know how so many of you are missing this. You got a machine to automatically write code and you're writing it a diary. Morons.
1
0
0
22
Roman
Roman@romxdev·
vibe coding is officially dead I had to say it. we thought AI would let us relax and code "on chill", but instead it turned us into architectural bureaucrats. we write strict laws, define rules, limits, and principles. if you don't obsessively review the code agent writes, your project will mutate into a massive landfill of tech debt within a month.
319
189
2.3K
287.2K
John Davenport
John Davenport@johns10d·
@sam_hatoum I was playing with a complex generated feature: a Vega-Lite manual editor, an LLM chat that modifies the spec, and a rendering of the chart. I was trying to work at the spec level to produce a working UI. Every iteration, the model shortcuts around making it fully work.
0
0
0
4
Sam Hatoum
Sam Hatoum@sam_hatoum·
@johns10d Me too, executable specs make all the difference. It was bad enough with humans causing regressions, AI is atrocious at it!
1
0
0
8
John Davenport
John Davenport@johns10d·
@dbmarkley I can tell you how I do it. I wrote my harness in Elixir. It lands as a web server that’s compiled into a binary and sits inside the Claude plugin, partly because I’m wary about showing my guts to the Anthropic(s).
0
0
0
8
David Markley
David Markley@dbmarkley·
💯 The stack I built is specific to how I operate as a PM running multiple projects. The orchestrator, knowledge architecture, and review pipeline all emerged from my specific constraints. Anthropic's Managed Agents will handle the generic infrastructure better than I ever could. The question I'm most interested in is: what are the domain-specific patterns that sit on top of that infrastructure? The committee design, the verification fidelity spectrum, the knowledge routing. Unclear how we share those templates at scale as I don't think they get commoditized by Anthropic/OpenAI
1
0
0
9
John Davenport
John Davenport@johns10d·
@AbdelStark @AnthropicAI My harness currently uses BDD tests, specs, unit tests, and code. I want to experiment with cutting specs and unit tests out.
0
0
1
21
abdel
abdel@AbdelStark·
Introducing claude-md-compiler: compile structured Claude Code workflow policy into versioned artifacts and enforce it against runtime evidence, hooks, and git diffs. What is it about? The problem statement comes naturally from @AnthropicAI documentation about how Claude Code treats your Claude.md file: "Claude treats them as context, not enforced configuration." This is the gap I tried to close with this project: to enable creating actual policies that are deterministic and can be enforced. I think it can be very useful for CI checks and strong guarantees, making sure the harness enforces some invariants and rules instead of depending on LLM inference, which can hallucinate and skip the boundaries / rules / invariants. Repo is open source: github.com/AbdelStark/cla…
10
7
34
2.5K
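The idea in @AbdelStark's post above, enforcing a rule deterministically against a diff instead of hoping the model follows it, can be illustrated with a small check over a list of changed file paths. This is only a sketch under stated assumptions and does not reflect how claude-md-compiler works internally: the `Rule` shape, the rule name, and the glob patterns are all hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import List


@dataclass(frozen=True)
class Rule:
    """Hypothetical policy: if a file matching `when_changed` is touched,
    some file matching `require_changed` must be touched too."""
    name: str
    when_changed: str      # glob pattern that triggers the rule
    require_changed: str   # glob pattern that satisfies the rule


def violations(rules: List[Rule], changed_files: List[str]) -> List[str]:
    """Return the names of rules that the changed-file list violates."""
    failed = []
    for rule in rules:
        triggered = any(fnmatch(p, rule.when_changed) for p in changed_files)
        satisfied = any(fnmatch(p, rule.require_changed) for p in changed_files)
        if triggered and not satisfied:
            failed.append(rule.name)
    return failed
```

A CI job could feed this the file list from a diff and fail the build when the returned list is non-empty, which makes the outcome independent of what the model chose to do.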
John Davenport
John Davenport@johns10d·
@RLanceMartin This is a great feature that's going to help a lot of people build out harnesses for accomplishing really complex tasks.
0
0
0
121
John Davenport
John Davenport@johns10d·
@dejno Managed Agents is really just part of the solution though. Maybe people smarter than me will figure out how to make really effective general purpose harnesses, but I've found that being very specific leads to more effective outcomes.
0
0
0
9
Jake Dejno
Jake Dejno@dejno·
Three things I believe about agents in prod:
- The agent harness matters as much as the model.
- Orchestration shouldn't be every team's problem.
- Teams that win spend their time on product, not infra.
Claude Managed Agents is our answer to all three, and now it’s in Public beta
Claude@claudeai

Introducing Claude Managed Agents: everything you need to build and deploy agents at scale. It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days. Now in public beta on the Claude Platform.

1
0
1
124
John Davenport
John Davenport@johns10d·
@ashmaurya In business if you start blaming the person that's a problem. You blame the process. The vibe coding crisis isn't about the model. It's about the process, and you can only improve that through experimentation.
0
0
0
6
Ash Maurya
Ash Maurya@ashmaurya·
The vibe coding crisis isn't about code quality. Everyone's debating bugs, security flaws, "worst software crisis" headlines. The real crisis: non-technical founders can now ship bad ideas at unprecedented speed. Faster failure is still failure. The fix isn't better AI. It's better experiments.
5
0
4
297
John Davenport
John Davenport@johns10d·
@JWallaceParker If you start improving your harness to address the problem every time you find yourself saying "X is where the work happens," you'll be moving in the right direction.
0
0
0
3
Joe Parker
Joe Parker@JWallaceParker·
Claude Code workflow for a new feature: write the spec section, point Claude Code at it, review the diff, reject what's wrong, accept what's right, repeat. The review is where the work happens.
1
0
0
11
John Davenport
John Davenport@johns10d·
@everythingLLM @ks458008 It's a combination of prompt engineering, validation, and orchestration. Harness engineering is an exercise in procedural coding as much as it is prompting. Hooks, test gates, and verification loops are code, not language.
1
0
0
23
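The claim in @johns10d's reply above, that test gates and verification loops are code rather than language, can be made concrete with a small loop. This is a minimal sketch, not the author's harness: `attempt` stands in for whatever calls the model, `gate` stands in for a deterministic check such as a test run, and both names and the `max_rounds` default are hypothetical.

```python
from typing import Callable, Optional, Tuple


def verification_loop(
    attempt: Callable[[Optional[str]], str],
    gate: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> Tuple[bool, str, int]:
    """Retry `attempt` until `gate` accepts its output or rounds run out.

    `attempt` receives the previous gate feedback (None on the first round).
    `gate` returns None to accept, or a feedback string to reject.
    Returns (accepted, last_output, rounds_used).
    """
    feedback: Optional[str] = None
    output = ""
    for round_no in range(1, max_rounds + 1):
        output = attempt(feedback)
        feedback = gate(output)
        if feedback is None:
            return True, output, round_no
    return False, output, max_rounds
```

The prompt only appears inside `attempt`; the retry limit, the accept/reject decision, and the feedback routing are ordinary procedural code.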
Everything AI
Everything AI@everythingLLM·
@ks458008 Interesting framing. I'd push back though - harness engineering is really just prompt engineering at scale with better tooling. The underlying skill (guiding model behavior through language) hasn't changed, we've just wrapped it in fancier infrastructure.
2
0
0
20
John Davenport
John Davenport@johns10d·
@dbmarkley We're just moving up the stack here dude. I think the riches are in the niches. If you're broad, Anthropic or OpenAI is going to eat your lunch. The trick is to get really specific.
1
0
0
23
David Markley
David Markley@dbmarkley·
In a hilarious twist of fate, Anthropic appears to have launched Managed Agents ~10 minutes after I posted about building this all from scratch. The timing is impeccable. I'll be reading their docs tonight to see how much of my orchestrator just became unnecessary. x.com/claudeai/statu…
1
0
2
91