Jonathan Malkin 🦊 | Building with Claude

1K posts

@builtwithjon

20yr enterprise tech → education & community founder. Building the whole thing with Claude as co-founder. AI in production, not theory. Austin 🦊

Austin · Joined July 2019
221 Following · 56 Followers
Paweł Huryn @PawelHuryn
@builtwithjon I'm not sure you can fix it completely differently. Traditional memory works only if you feed your agent carefully curated facts.
Paweł Huryn @PawelHuryn
RE: Memory

Greg's right on the trajectory. But "memory" by itself fails at month three. Every artifact gets appended. Contradictions flatten into fake consensus. The agent drifts silently.

What compounds is a hierarchy: raw source (immutable) → working memory (tagged observations) → durable knowledge (active hypotheses, synthesized facts, committed decisions, stakeholder state).

Promotion criteria: signal strength, relevance, and trust earn the move up. Then a weekly sweep promotes what's confirmed, surfaces contradictions instead of collapsing them, compresses patterns, and archives what shipped.

"Starting today vs starting in 6 months" is real. But the unfair advantage only compounds if the loop runs. Without the sweep, you're storing, not learning.

(Shipping PM Brain OS this week.)
GREG ISENBERG @gregisenberg

More AI agent observations below (I keep adding to the list):

1. Hermes agents write to their own memory after every task. Which means starting today versus starting in 6 months is an unfair advantage for you.

2. We're maybe 12 months from an agent that can watch you work for a week and then do your job without any instructions. The screen recording plus agent memory plus local model combination makes this possible right now.

3. The real reason local models matter for founders: you can ship a product where the AI runs entirely on the customer's device and you never touch their data. Zero privacy concerns. Zero server costs. Zero compliance headaches. That changes which industries you can sell to overnight. Healthcare, legal, finance, all the regulated verticals that won't send data to the cloud just opened up.

4. Every company needs to be rebuilt as a "second brain" before agents can be useful. That means every process, every decision, every piece of institutional knowledge has to exist in a format an agent can read. Most companies have none of this.

5. Agent costs are the new headcount. Won't be crazy for companies to spend 50%+ of their total headcount cost on tokens.

6. Agents are accidentally creating internal competition at companies. The marketing agent and the sales agent are optimizing for different metrics and working against each other without anyone realizing it. It took humans decades to develop cross-functional alignment. Nobody thought about it for agents.

7. The YAML config file is becoming the new org chart. Who reports to who, what permissions they have, what tools they access, all defined in a config file. The company's structure is literally a file you can version control, fork, and deploy. That's new.

8. The first agents that can smell a scam are going to be worth billions. Right now agents will happily wire money to a fake invoice because it matched the format. The trust layer is completely missing.

9. We're about to find out that most "expertise" was actually just memory. Knowing the tax code. Knowing the case law. Knowing which supplier charges what. When an agent holds all of that in context, the expert's value shifts from "I know things" to "I know which things matter." Much smaller group of people.

10. We're all running the same models. The differentiation is in what you feed them. Two founders with the same agent, same model, same tools will get wildly different results based purely on the quality of their knowledge base. Garbage context in, garbage output out. Forever.

11. The most underbuilt category in AI right now: agents for old people. 70 million boomers who need help with medical forms, insurance claims, and appointment scheduling.

12. Agent latency is the new page load speed. If your agent takes 45 seconds to respond, your customer already switched to one that takes

13. Skills files are the new apps. A SKILL.md that tells an agent how to do one thing well is more valuable than a SaaS subscription that does the same thing behind a login screen.

14. AI hardware... how do you create devices that are good businesses that people want? It'll be a $30 dongle you plug into existing dumb devices to give them an agent brain. Smart toaster doesn't need to be built from scratch. It needs a $30 brain attached to a $15 toaster.

15. Your agent can read faster than you can think. The bottleneck in every agent workflow is now the human approval step. We're the slow part. That's a strange thing to sit with.

16. Agents made the 80/20 rule violent. The 20% of work that matters is now the only work humans do. The 80% just disappeared. Entire job descriptions were hiding inside that 80%.

17. The thing I keep coming back to: the best businesses right now are being built by people who are just slightly ahead of their customers. Not 10 years ahead. 6 months ahead. That's the sweet spot. Far enough to lead. Close enough to be understood.
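
The hierarchy and weekly sweep in Paweł Huryn's post above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PM Brain OS: the `Observation` fields, the `PROMOTE_AT` threshold, and the rule that the weakest of the three criteria decides promotion are all hypothetical choices made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

PROMOTE_AT = 0.7  # hypothetical promotion threshold, not from the post


@dataclass(frozen=True)  # frozen: an observation is never edited in place
class Observation:
    topic: str
    claim: str
    signal: float      # signal strength, 0..1
    relevance: float   # 0..1
    trust: float       # 0..1
    shipped: bool = False

    def score(self) -> float:
        # Assumption: the weakest criterion decides whether a claim moves up.
        return min(self.signal, self.relevance, self.trust)


def weekly_sweep(working):
    """Sort working memory into durable, contradictions, archive, working."""
    result = {"durable": [], "contradictions": [], "archive": [], "working": []}
    by_topic = defaultdict(list)
    for obs in working:
        if obs.shipped:
            result["archive"].append(obs)   # archive what shipped
        else:
            by_topic[obs.topic].append(obs)

    for topic, group in by_topic.items():
        claims = {o.claim for o in group}
        if len(claims) > 1:
            # Surface the contradiction instead of collapsing it.
            result["contradictions"].append((topic, sorted(claims)))
            result["working"].extend(group)
            continue
        best = max(group, key=Observation.score)
        if best.score() >= PROMOTE_AT:
            result["durable"].append(best)  # repeats compress to one entry
        else:
            result["working"].extend(group)
    return result


observations = [
    Observation("pricing", "annual plan wins", 0.9, 0.8, 0.9),
    Observation("pricing", "annual plan wins", 0.8, 0.8, 0.7),
    Observation("onboarding", "users want video", 0.9, 0.9, 0.9),
    Observation("onboarding", "users want docs", 0.9, 0.9, 0.9),
    Observation("launch", "v1 shipped", 0.9, 0.9, 0.9, shipped=True),
    Observation("churn", "price is the driver", 0.4, 0.9, 0.9),
]
swept = weekly_sweep(observations)
```

Here the two agreeing pricing observations collapse into one durable entry, the conflicting onboarding claims are reported as a contradiction and stay in working memory, the shipped item is archived, and the weak churn observation waits for more evidence.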

Jacob Klug @Jacobsklug
I built a Claude Cowork OS that replaces OpenClaw and runs on autopilot. Manages my business & personal life tasks. I created the whole playbook so you can rebuild it tonight.

What's inside:
• The exact foundation prompt
• 3-level orchestration map
• Memory template for global context
• Routing table for file management
• Starter workstations (finance, content, community, habits)
• Project file structure
• Single prompt that builds the entire folder tree

Follow and comment 'OS'. I'll DM it to you.
Paweł Huryn @PawelHuryn
@kir_varlamov Two layers. There's the harness Anthropic ships, and the system you build around it - knowledge, skills, subagents. Saying the second one is their job is like saying they own the software your agent writes.
Jonathan Malkin 🦊 | Building with Claude
The strangest finding:

GPT-5.4 on single-turn prompts: 38.6s
GPT-5.4 on 13-turn agent workloads: 927.6s

That's a 24x penalty. Highest of any model in the set.

Kimi K2.6's penalty: 8.75x. MiniMax: 10.9x.
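
For reference, the 24x figure can be reproduced from the two GPT-5.4 timings in the post. The sketch below assumes the penalty is simply total agent-workload time divided by single-turn time; the function names are hypothetical. The post does not give raw timings for Kimi K2.6 or MiniMax, so only the GPT-5.4 ratio is computed.

```python
def penalty(single_turn_s: float, agent_workload_s: float) -> float:
    """Ratio of total agent-workload time to single-turn time (assumed definition)."""
    return agent_workload_s / single_turn_s


def seconds_per_turn(agent_workload_s: float, turns: int) -> float:
    """Average wall-clock time per turn within the agent workload."""
    return agent_workload_s / turns


gpt_penalty = penalty(38.6, 927.6)
gpt_per_turn = seconds_per_turn(927.6, 13)

print(f"penalty: {gpt_penalty:.1f}x")        # penalty: 24.0x
print(f"per turn: {gpt_per_turn:.1f}s")      # per turn: 71.4s
```

Spread over 13 turns, the workload averages about 71 seconds per turn, so each turn runs a little under twice as long as a standalone prompt under this reading.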
Jonathan Malkin 🦊 | Building with Claude
Two open-weight models quietly caught GPT-5.4 this month. Kimi K2.6 and MiniMax M2.7 are now within 5 quality points on writing. Both 2x faster on agent tasks. One is half the price. The frontier is getting crowded.
Jonathan Malkin 🦊 | Building with Claude
Loving Bench Loop for testing local models. Finally something decent for basic testing! I'd still like to see more advanced end-to-end workflow testing, but this is a great start. Have you seen any local benchmark apps that do more in-depth scenario testing?