
Jeremy
595 posts

@n3lson
I build and test agent systems on real machines. Local AI, harnesses, failure modes.




>started treating my OpenClaw/Hermes like patients.
>the bugs that kept coming back all had real medical names. take the one everyone calls "AI hallucination." medically it's confabulation.
>now i scan for the organ that failed. gbrain handles memory. OpenClaw approvals handle actions. trajectory bundles handle self-check.
>a healthier agent isn't a smarter brain. it's a more complete body.
>a few weeks in, you stop blaming "the model." you find the broken organ and patch it.


Within 6 to 12 months, every software product will need an API, MCP, and CLI. More and more, people expect to be able to interact with your product through automation, AI and agents. Historically, platform was a later stage of maturity play. Going forward, you won't really thrive in this new world without a platform.



Agents don't need types. They're perfectly capable of pulling off incredible refactorings without. Give them a linter and a test suite, and you have all you need. Token efficiency is where it's at.


As a designer, I've designed with code almost every single day this past year, across my company & personal projects. In 2024, I only made 84 commits, and much of my work still lived in Figma/Adobe. A lot changed - AI tooling obviously helped. But more than anything, I realized I’ve never had this much fun designing. It has become a hobby rather than just work. I’m learning faster, iterating more, and making design decisions across more layers of abstraction: interface, system, interaction, infrastructure. Designing with code is not for everyone. It's just a part of the modern toolkit - and Figma, Adobe, and many other tools are still there to help. But for me, it has made design feel fluid again, and brought back the same spark I felt when learning Figma for the first time years ago: the realization that I can play an active role in shaping real, helpful products. Design is getting exponentially closer to its “magic wand” moment, and I’m all for it.



🚨 THE AI COST CRISIS HAS STARTED. Microsoft reportedly told engineers to stop using Claude because AI bills were exploding, while Uber says its entire yearly AI budget was already destroyed by April.


You might believe you should spend less time thinking about code because of AI. I strongly disagree! We’re watching this play out live where tons of AI-generated code becomes a liability. At the end of the day, an engineer needs to be responsible / on call for code that gets shipped to production. If you don’t understand the system you’re trying to debug, you’re probably going to have a bad time. Yes, AI can help with all of this, if you set up the proper systems. You can have agents triage prod logs, look at errors, etc. You can speed up parts of the investigation, but an engineer needs to make the call. There might be serious customer or financial implications from that change. I expect the trend to continue toward trimming dependencies, vendoring code so you can modify it directly, preferring simpler systems with fewer abstractions, and spending waaaay more time thinking about system design and code maintenance. I’ve said this before, but it’s a great time to get familiar with CS fundamentals and some of the history behind what great software looks like. Many parts will be different in the coming years as AI progresses, but also a lot more than people realize will stay the same.



every coding agent looks like a senior dev until you ask it to use an ORM. a new paper shows that adding a real database and architecture rules drops agent pass rates by 30%, with cross-file consistency hitting a brutal 8%. we didn't build autonomous engineers, we built a machine that writes single-file flask apps and panics the second it touches a data layer.


In a recent batch talk, YC General Partner @t_blom broke down how to build a self-improving, AI-native company. He walks through how to create recursive, self-improving AI loops, and why founders who get this right will run companies that improve while they sleep.
00:00 - Companies Are Roman Legions
00:54 - Copilots Are the Wrong Mental Model
01:55 - Extract the Domain Knowledge
02:24 - The Recursive Self-Improving Loop
04:12 - The Holy Shit Moment at YC
05:50 - Self-Optimizing Product and Support Loops
06:29 - Burn Tokens, Not Headcount
07:23 - Middle Management Is Over
08:05 - Make Everything Legible to AI
09:40 - Regenerating the YC User Manual
11:19 - Software Is Ephemeral, Context Is Valuable
12:18 - Where Humans Still Matter


She literally broke down how to run evals in Claude Code (built the whole thing live):
01:34 - What people get wrong with evals
04:35 - Why product taste is the alpha now
09:28 - Building a PM agent from one prompt
19:00 - Instrumentation without writing code
22:00 - Watching traces stream in live
28:00 - Getting Claude to write your first eval
33:58 - When vibe evals work and when they don't
48:50 - The self-improving loop (this part is wild)
01:03:00 - Same-day shipping is real
01:06:00 - The context graph unlock




Q: How are job postings for software engineers rising rapidly despite AI agents automating coding? A: Because there’s far more code to manage than ever before. We’re already seeing a 14x YoY increase in GitHub commits, and it’s accelerating. AI has dramatically lowered the cost of writing code, so it’s now being used across far more businesses, applications, and use cases. We’re at the beginning of a massive productivity boom driven by the proliferation of bespoke software throughout the entire economy. Coding has been AI’s breakout use case this year. The fact that it’s increased demand for software engineers — rather than decreased it — should call into question the entire “AI will cause mass job loss” narrative.


We’ve automated every single thing we can @every with AI agents. And yet there’s way more human work to do than ever. We’ve gone from 4 -> 30 human employees since GPT-3. I wrote a report on the structural reasons: how AI makes expert competence cheap, why that drives up demand for experts, and why the dynamic only intensifies as we approach AGI. After Automation: every.to/p/after-automa…







