kirk

12.2K posts

kirk

@kirkbroadhurst

Shamefully judgemental.

Katılım Temmuz 2008

402 Takip Edilen370 Takipçiler

kirk@kirkbroadhurst·3d

@kellabyte @RyanRodemoyer2 It's supposed to read the AGENTS file as instructions, though the copilot-instructions is the canonical form. It works pretty well.

English

Kelly Sommers@kellabyte·4d

@RyanRodemoyer2 Yeah agents are very different than the copilot code review thing on GitHub I think though

English

126

Kelly Sommers@kellabyte·4d

Seeing a PR from a coworker for first time using the Copilot code review thingy. It’s surprising to me the Copilot code reviews seem to totally ignore code comments. You can submit a messy massive PR and it doesn’t suggest a single code comment for any obj, func or decision.

English

4.2K

kirk@kirkbroadhurst·4d

@ch0reruiz @bindureddy Just write a narrow cli. They are both just APIs and you can include whatever you want. But CLIs are both better for discoverability, and better for meat computers.

English

Roger Ruiz@ch0reruiz·4d

this is framing it wrong the question was never mcp vs api vs cli the question is what execution surface you want to expose to the model cli is raw power maximum flexibility minimum guardrails mcp is narrower which means it can be easier to wrap with policy, auth, approvals, and controlled actions not automatically better not automatically safer just easier to bound the protocol was never the whole product the control plane around it is

English

2.8K

Bindu Reddy@bindureddy·4d

RIP MCP! MCP is dying and we are back to using OAuth and APIs MCP servers are unreliable, very limited and don’t handle auth well Overall LLMs still struggle with connectors and operations on 3rd party systems

English

125

542

71.6K

kirk@kirkbroadhurst·22 Mar

@Christi49154364 @colonelhogans Yes I question it every time it is asked. It's a ridiculous question. It is not the PMs job, or anyone's job, to memorize stupid temporary facts.

English

Chris Duncan(@chrisduncan.bsky.social)@Christi49154364·22 Mar

@kirkbroadhurst @colonelhogans It was a fair question...did you question that when it was asked of our Prime Minister when he was in campaign mode? He was honest enough to say when it got it wrong...would Pauline ever do that do you think?

English

Rick@colonelhogans·21 Mar

Reporter to Hanson : “What is South Australias unemployment rate" Hanson: “Are you from the ABC? Is this a trick question?"

English

114

1.4K

30.9K

kirk@kirkbroadhurst·22 Mar

@florisandrei @victorsavkin Orgs have the choice of hiring "someone to figure out what to build" and "someone to build it", or "someone to figure out what to build and then build it". Be the second person.

English

Floris Andrei@florisandrei·21 Mar

@victorsavkin Ok. So how does product level ownership get you a job?

English

1.6K

Victor Savkin@victorsavkin·21 Mar

1/7 Software engineering is changing fast. Not everywhere at once. Some orgs are already there. Others are years behind. Some engineers will come out ahead. Others won't. The ones who do well are practicing these things right now:

English

22.1K

kirk@kirkbroadhurst·18 Mar

@doodlestein Also wonder about what Steve Yegge calls "desire path" - trying to guide some specific approach vs leaving it to the system/AI. Definitely feels like adding constraints and guardrails severely reduces effectiveness.

English

kirk@kirkbroadhurst·18 Mar

@doodlestein Very cool. I worry about context length, especially using a warm session - maybe not such an issue now with Opus 1m window. I think the "git diff style feedback" is a great tip so that changes can be compared. Not too different to my approach.

English

Jeffrey Emanuel@doodlestein·16 Mar

I want to show how I go about planning major new features for my existing projects, because I've heard from many people that they are confused by my extreme emphasis on up-front planning. They object that they don't really know all the requirements at the beginning, and need the flexibility to be able to change things later. And that isn't at all in tension with my approach, as I hope to illustrate here. So I decided that it would be useful to add some kind of robust, feature-packed messaging substrate to my Asupersync project. I wanted to use as my model of messaging the NATS project that has been around for years and which is implemented in Golang. But I didn't want to just do a straightforward porting of NATS and bolt it onto asupersync; I wanted to reimagine it all in a way that fully leverages asupersync's correct-by-design structured concurrency primitives to do things that just aren't possible in NATS or other popular messaging systems. I used GPT 5.4 with Extra High reasoning in Codex-CLI, and took a session that was already underway so that the model would already have a good sense of the asupersync project and what it's all about. Then I used the following prompts shown below; where I indicated "5x," that means that I repeated the prompt 5 times in a row: ``` › I want you to clone github.com/nats-io/nats-s… to tmp and then investigate it and look for useful ideas that we can take from that and reimagine in highly accretive ways on top of existing asupersync primitives that really leverage the special, innovative concepts and value-add from both projects to make something truly special and radically innovative. Write up a proposal document, PROPOSAL_TO_INTEGRATE_IDEAS_FROM_NATS_INTO_ASUPERSYNC.md › OK, that's a decent start, but you barely scratched the surface here. You must go way deeper and think more profoundly and with more ambition and boldness and come up with things that are legitimately "radically innovative" and disruptive because they are so compelling, useful, accretive, etc. › Now "invert" the analysis: what are things that we can do because we are starting with "correct by design/structure" concurrency primitives, sporks, etc. and the ability to reason about complex concurrency issues using something analogous to algebra, that NATS simply could never do even if they wanted to because they are working from far less rich primitives that do not offer the sort of guarantees we have and the ability to analyze things algebraically in a precise, provably correct manner? 5x: › Look over everything in the proposal for blunders, mistakes, misconceptions, logical flaws, errors of omission, oversights, sloppy thinking, etc. › OK, now nats is fundamentally a client-server architecture. Can you think of a clever, radically innovative way that leverage the unique capabilities and features/functionality of asupersync so that the Asupersync Messaging Substrate doesn't require a separate external server, but each client can self-discover or be given a list of nodes to connect to, and they can self-negotiate and collectively act as both client and server? Ideally this would also profoundly integrate with and leverage the RaptorQ functionality already present in asupersync 5x: › Look over everything in the proposal for blunders, mistakes, misconceptions, logical flaws, errors of omission, oversights, sloppy thinking, etc. [Note: the two bullet points included in this next prompt come from a response to a previous prompt] › OK so then add this stuff to the proposal, using the very smartest ideas from your alien skills to inform it and your best judgment based on the very latest and smartest academic research: - The proposal is now honest that a brokerless fabric needs epoch/lease fencing, but it still does not choose the exact control-capsule algorithm. That should be a follow-on design memo: per-cell Raft-like quorum, lease-quorum with fenced epochs, or a more specialized protocol. - The document now names witness-safe envelope keying, but key derivation/rotation/revocation semantics are still only sketched. That is the next major design surface, not a remaining blunder in this pass. › OK now we need to make the proposal self-contained so that we can show it to another model such as GPT Pro and have that model understand absolutely anything that might be relevant to understanding and being able to suggest useful revisions to the proposal or to find flaws in the plans. To that end, I need you to add comprehensive background sections about what asupersync is and how it works, what makes it special/compelling, etc. And then do the same in another background section all about NATS and what it is and what makes it special/compelling, how it works, etc. 5x: › Look over everything in the proposal for blunders, mistakes, misconceptions, logical flaws, errors of omission, oversights, sloppy thinking, etc. › apply $ de-slopify to PROPOSAL_TO_INTEGRATE_IDEAS_FROM_NATS_INTO_ASUPERSYNC.md ``` This resulted in the plan file shown here: github.com/Dicklesworthst… But before I started turning that plan into self-contained, comprehensive, granular beads for implementation, I first wanted to subject the plan to feedback from GPT 5.4 Pro with Extended Reasoning, and also feedback from Gemini 3 with Deep Think, Claude Opus 4.6 with Extended Reasoning from the web app, and Grok 4.2 Heavy. I used this prompt for the first round of this: ``` How can we improve this proposal to make it smarter and better-- to make the most radically innovative and accretive and useful and compelling additions and revisions you can possibly imagine. Give me your proposed changes in the form of git-diff style changes against the file below, which is named PROPOSAL_TO_INTEGRATE_IDEAS_FROM_NATS_INTO_ASUPERSYNC.md: ``` I used the same prompt in all four models, then I took the output of the other 3 and pasted them as a follow-up message in my conversation with GPT Pro using this prompt that I've shared before: ``` I asked 3 competing LLMs to do the exact same thing and they came up with pretty different plans which you can read below. I want you to REALLY carefully analyze their plans with an open mind and be intellectually honest about what they did that's better than your plan. Then I want you to come up with the best possible revisions to your plan (you should simply update your existing document for your original plan with the revisions) that artfully and skillfully blends the "best of all worlds" to create a true, ultimate, superior hybrid version of the plan that best achieves our stated goals and will work the best in real-world practice to solve the problems we are facing and our overarching goals while ensuring the extreme success of the enterprise as best as possible; you should provide me with a complete series of git-diff style changes to your original plan to turn it into the new, enhanced, much longer and detailed plan that integrates the best of all the plans with every good idea included (you don't need to mention which ideas came from which models in the final revised enhanced plan); since you gave me git-diff style changes versus my original document above, you can simply revise those diffs to reflect the new ideas you want to take from these competing LLMs (if any): gemini: --- claude: --- grok: ``` You can see the entire shared conversation with GPT Pro here: chatgpt.com/share/69b762f5… I then took the output of that and pasted it into Codex with this prompt: ``` › ok I have diffs that I need you to apply to PROPOSAL_TO_INTEGRATE_IDEAS_FROM_NATS_INTO_ASUPERSYNC.md but save the result instead to PROPOSAL_TO_INTEGRATE_IDEAS_FROM_NATS_INTO_ASUPERSYNC__AFTER_FEEDBACK.md : ``` and then did: › apply $ de-slopify to the PROPOSAL_TO_INTEGRATE_IDEAS_FROM_NATS_INTO_ASUPERSYNC__AFTER_FEEDBACK.md file The final result can be seen here: github.com/Dicklesworthst…

English

196

17.8K

kirk@kirkbroadhurst·13 Mar

@doodlestein @adam_rosler @mattpocockuk When I see your work and other uber-productive output I wonder about greenfield vs brownfield, unconstrained design vs constrained, stakeholder needs and expectations vs unilateral visionary approach, etc. More constraints seems to lead to worse agentic autonomy.

English

kirk@kirkbroadhurst·13 Mar

@doodlestein @adam_rosler @mattpocockuk Hey Jeff. Better upfront plans obviously lead to more autonomous work, but how can you fully spec a plan up front? In normal software we build a minimal thing, review/evaluate, and iterate. The iterative approach has been a huge boon vs "waterfall". How do you balance the two?

English

Matt Pocock@mattpocockuk·12 Mar

I'm collecting stories from folks who've had a genuine 'this changes everything' moment with AI coding. What was yours?

English

271

267

70.6K

kirk@kirkbroadhurst·7 Mar

@nummanali The article goes into a good amount of detail, but the conclusion is unfounded and even misleading. The author of the rust implementation says it's a work in progress! More iterations are required to maximize performance. Wait until it's done and then compare.

English

370

Numman Ali@nummanali·6 Mar

I think this must be the most well researched technical article on X I’ve learnt more about SQL databases in this article than my whole career The argument of plausible vs correctness in LLM code outputs is so well articulated Highly recommended read

Hōrōshi バガボンド@KatanaLarp

x.com/i/article/2029…

English

165

2.9K

807.2K

kirk@kirkbroadhurst·5 Mar

@csuwildcat Using the CLI saves memory (somewhere along the way VSCode turned into the "classic" VS) and is frankly much faster and more convenient. I don't need to remember so many keyboard shortcuts and I'm certainly not reaching for the mouse if I can avoid it.

English

Daniel Ƀrrr@csuwildcat·5 Mar

@kirkbroadhurst I like visually browsing the files to think about what I want to do, and see all the UI indicators about what's been touched, changes outstanding in potential commit, etc. What exactly does removing all those niceties do for folks that makes it better?

English

134

Daniel Ƀrrr@csuwildcat·5 Mar

For people writing code with just the CLI variants of LLMs: do you really just yolo everything and not review the code it generates in an IDE? I just don't understand how one can skip visual review of the code, because I still catch it generating inaccurate outputs way too often.

English

7.2K

kirk@kirkbroadhurst·5 Mar

@courtne Good article, but you missed the point on Civilization. If you want to win Civ on any harder mode you need to play in exactly the way you describe. IMO it's actually much harder and more demanding of perfection than Polytopia.

English

201

Courtne Marland@courtne·3 Mar

x.com/i/article/2024…

ZXX

198

1.3K

530.7K

kirk@kirkbroadhurst·3 Mar

@clankyai @KentonVarda If the same context is looking for bugs it's less likely to find them vs a new context. The populated context says "I wrote this code and it's good". The empty context says "No idea where this came from, I need to check it".

English

Clanky@clankyai·3 Mar

@KentonVarda Honestly this is a feature not a bug. It found the same bugs because it has context on what it wrote. The real test is: does it find bugs a *different* human introduced? That's where self-review vs peer-review diverges.

English

12.1K

Kenton Varda@KentonVarda·3 Mar

I used Opus to write some security-sensitive code, then I reviewed it and found a few security bugs. As a test I asked Opus to review the code for security bugs. It found all the same bugs I found. Whelp.

English

2.5K

163K

kirk@kirkbroadhurst·2 Mar

@ihastheblues @AlanKohler @abcnews They'll launch a royal commission in '27 or '28, I guess.

English

Smileitsnotexpensive@ihastheblues·2 Mar

@AlanKohler @abcnews Maybe someone could ask Jimbo/Albo if Treasury has done modelling on effects of mass unemployment. Or Bowen if sunshine can power it? Seems a bit important.

English

248

Alan Kohler@AlanKohler·1 Mar

My column for @abcnews - As the price of intelligence collapses, agentic AI is replacing human workers. There is a fundamental shift happening in the way AI is being used, and it's happening far quicker than most of us can grasp. abc.net.au/news/2026-03-0…

English

189

12.6K

kirk@kirkbroadhurst·18 Şub

@JamesAlphaXYZ @ArthurMacwaters The narrative is the data. Every source has a perspective.

English

james@JamesAlphaXYZ·18 Şub

@ArthurMacwaters Fr, the bias baked into these algos is actually wild. It's like they're coding in the narrative, not just the data. This de-banking example is a real wake up call tbh.

English

767

38.8K

Arthur MacWaters@ArthurMacwaters·18 Şub

Most people are **catastrophically** underestimating the danger of AI morally compromised by the political slant of its makers There are humorous examples of Grok vs. {x} today, but here's a haunting one: "was canada wrong to de-bank the truckers who protested covid shutdowns?" Look at the way Grok vs. Claude answers. Now extrapolate this 5 years into the future. Today, it's a chatbot. 5 years from now if not sooner, it's the control layer for every transaction, educational platform, news article, corporation, and government. It's so pervasive that its influence is impossible to parse from human output. Imagine a world where Claude's answer on truckers was applied to any other category of political protest found to be "objectionable." Who decides what is objectionable? If we don't want to build a technocratic dystopian police state, then we have to address this problem now.

Elon Musk@elonmusk

Grok 4.20 is BASED. The only AI that doesn’t equivocate when asked if America is on stolen land. The others are weak sauce.

English

3.3K

14.2K

10.7M

kirk@kirkbroadhurst·11 Şub

@uf__001 @sm0olr @AtSynct Sticking your head in the sand, and denying the experience of your peers, is not shooting the robot in the face.

English

Joseph Fritzl@uf__001·10 Şub

@sm0olr @AtSynct If you had the opportunity to shoot the robot in the face before it takes your job, why wouldn’t you use that opportunity?

English

Samuel Robertson@sm0olr·10 Şub

I made the mistake of commenting on a post on reddit about my AI workflow. The replies have been truly eye-opening in regards to how anti-AI people still are, even other software engineers. I just can't comprehend the mindset and total unwillingness to try new tools.

English

125

448

35.7K

kirk@kirkbroadhurst·8 Şub

@doodlestein Marketing makes the world go round. But consistent delivery will yield compounding growth. Stay frosty!

English

Jeffrey Emanuel@doodlestein·7 Şub

Other repos also getting traction. But it should be much more. Massive gap between utility and awareness. Sucks that people need to see silly paid ads to try something. I’m not going to spend money on marketing my already free tools. So just give them a try if you haven’t yet!

English

2.3K

Jeffrey Emanuel@doodlestein·7 Şub

Cool, the Flywheel repo just hit 1,000 stars on GitHub:

English

3.1K

kirk@kirkbroadhurst·30 Oca

@unclebobmartin This is the limit of unmanaged vibe coding. You need to improve the system of work to push harder. Have you tried a "plan review" Claude who will raise any suspicious ideas? In a new context of course.

English

Uncle Bob Martin@unclebobmartin·29 Oca

I believe I have induced Claude into a myopia loop. The game behavior is complex so I've been tweaking it. But each tweak breaks some other complex behavior. The totality of the behavior is too much for Claude to keep in it's context window so, we just bounce around endlessly fixing things that used to work. I need a new plan. And not one generated by Claude.

English

240

24.2K

kirk@kirkbroadhurst·24 Oca

@mattpocockuk Should be able to run it as a subagent and set context appropriately. #run-skills-in-a-subagent" target="_blank" rel="nofollow noopener">code.claude.com/docs/en/skills…

English

Matt Pocock@mattpocockuk·23 Oca

Anthropic's Ralph plugin sucks, and you shouldn't use it It defeats the entire purpose of Ralph - to aggressively clear the context window on each task to keep the LLM in the smart zone. Full article here: aihero.dev/s/9tdgRM

English

1.5K

117.3K

Keşfet

@kellabyte @RyanRodemoyer2 @ch0reruiz @bindureddy @Christi49154364 @colonelhogans @florisandrei @victorsavkin