Daniel says it won’t scale

2.3K posts


@Daniel0x6019

Cloud guy. Senior developer. Expert friendly.

Joined October 2018
696 Following · 79 Followers
spidey
spidey@lochan_twt·
The day a blind man sees, the first thing he throws away is the stick that has helped him all his life.
spidey tweet media
660
4.4K
60.5K
2.6M
Daniel says it won’t scale retweeted
Raul Junco
Raul Junco@RaulJuncoV·
Every time I see a team celebrating their new "shared module," I remember this lesson. Reuse is a dangerous form of coupling.

They found the same logic in two places and did what good engineers do: put it in one place and called it a win. Clean, responsible, textbook.

Six months later, someone needs to change it. Suddenly, a small update for one team's requirements breaks three services, blocks two releases, and triggers an emergency meeting between people who've never talked to each other before.

This is the cost nobody preaches about. DRY is one of those principles that feels unquestionably right until you apply it across team boundaries. The moment you share a module between domains, you're not just sharing code. You're creating a dependency that nobody owns and everyone resents.

Before you reuse, ask: Will this change often? Does it belong to one domain? Are the consumers truly aligned in purpose? Will one team's change surprise another team? If the answer to any of these is "I'm not sure," stop. Duplicate it.

I know how that sounds. It feels lazy. It feels like the thing a junior developer does before they know better. But here's what nobody wants to say out loud: two independent implementations you control are almost always cheaper than one shared one serving masters with different goals.

Duplication is a local problem. Coupling is an organizational problem. One of them you can fix in an afternoon. The other requires a meeting with five teams and someone's manager.

Reuse isn't free. Treat it like the trade-off it is.
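The trade-off in this tweet can be made concrete with a toy example. Everything below is hypothetical (invented module and function names), a sketch of the coupling failure mode, not anyone's real code:

```python
# Hypothetical sketch: a helper extracted into a "shared module" because two
# domains (checkout and invoicing) happened to need the same formatting logic.

def format_price(amount_cents: int) -> str:
    """shared/pricing.py -- originally written for the checkout team."""
    return f"${amount_cents / 100:.2f}"

# Later, checkout needs locale-aware output and changes the function:
def format_price_v2(amount_cents: int, locale: str = "en_US") -> str:
    symbol = {"en_US": "$", "de_DE": "€"}[locale]
    return f"{symbol}{amount_cents / 100:.2f}"

# Invoicing still calls format_price(amount) everywhere. If the shared function
# is changed in place instead of duplicated or versioned, every invoicing call
# site breaks at once -- the "small update breaks three services" scenario.
print(format_price(1999), format_price_v2(1999, "de_DE"))  # $19.99 €19.99
```

Two local copies, one per team, would have let each domain evolve its formatting independently, at the cost of a trivial duplication.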
Raul Junco tweet media
30
69
414
25.4K
Daniel says it won’t scale retweeted
Gary Marcus
Gary Marcus@GaryMarcus·
Some things never change. If you don’t understand this one, you don’t understand what’s happening in AI. Marcus, 1998: neural nets have trouble generalizing far beyond the data. Marcus, 2001, 2012, 2019, 2022, etc: neural nets have trouble generalizing far beyond the data. Apple, 2025: neural nets have trouble generalizing far beyond the data. Meta/Stanford/Harvard, 2026: neural nets have trouble generalizing far beyond the data.
Deedy@deedydas

The creators of SWE-Bench just dropped a really simple new benchmark every LLM gets 0% on. ProgramBench asks: can models recreate real executable programs (ffmpeg, SQLite, ripgrep) from scratch with no internet? We are far from saturated on model quality.

101
360
2.6K
356.9K
Corey Quinn
Corey Quinn@QuinnyPig·
Been a while since we played this game: Give me a vendor, I’ll tell you what their AWS re:Invent booth swag should be.
37
4
37
18.2K
Darren Shepherd
Darren Shepherd@ibuildthecloud·
Because it's not an abstraction layer. The agent is not responsible for the code; the engineer is. You get fired, not the agent. Your value is specifically that you can be fired. I'm serious. Your job is to make sure the agent doesn't go south. Are you going to risk your livelihood by never knowing what the agent is doing?
1
0
9
455
Daniel says it won’t scale retweeted
techknight
techknight@techknight2·
well Ask.com aka Ask Jeeves kinda just went silently into the night
techknight tweet media
83
1.1K
6.2K
294K
Sara Mauskopf
Sara Mauskopf@sm·
omg how do I get Claude to stop asking me permission?? I changed all my settings to not ask but it still asks constantly!!!!! Gonna threaten to replace him with a human if he doesn't shape up!!
8
0
25
13.9K
Daniel says it won’t scale retweeted
Lenny Rachitsky
Lenny Rachitsky@lennysan·
My biggest takeaways from Claude Code's Head of Product @_catwu:

1. Anthropic’s product development timelines have gone from six months to one month, sometimes one week, sometimes one day. Part of this acceleration is access to the latest models (i.e. Mythos). Another is shipping new products into “research preview,” making clear it's early, experimental, and might not be supported forever. Another is an evergreen "launch room" where engineers post ready features and marketing turns around announcements the next day.

2. The PM role is shifting from coordinating multi-month roadmaps to enabling teams to ship daily. As Cat puts it, “There should be less emphasis on making sure you are aligning your multi-quarter roadmaps with your partner teams and more emphasis on, OK, how can we figure out the fastest way to get something out the door?”

3. The most efficient shipping unit is an engineer with great product taste. On Cat’s team, many engineers go end-to-end, from seeing user feedback on Twitter to shipping a product by the end of the week, without a PM involved. Also, almost all the PMs on the Claude Code team have either been engineers or ship code themselves, and the designers have been front-end engineers. The roles are merging, and the most valuable skill is product taste, not job title.

4. Build products that are on the edge of working. Claude Code’s code review product failed multiple times because earlier models weren’t accurate enough. But because the prototype was already built, they could swap in Opus 4.5 and 4.6 and immediately test whether the gap was closed. Teams that wait for the model to be ready will always be a cycle behind.

5. The most underrated skill for building AI products is asking the model to introspect on its own mistakes. Cat regularly asks the model why it made an unexpected decision. The model will explain that something in the system prompt was confusing, or that it delegated verification to a subagent that didn’t check its work. This reveals what misled the model so the team can fix the harness.

6. Every model release forces their team to revisit existing products and audit their system prompt to remove features the model no longer needs. Claude Code’s to-do list was a crutch for earlier models that couldn’t track their own work. With Opus 4, the model handles it natively. Features built as scaffolding for weaker models become debt when the model catches up, so the team actively strips them.

7. Anthropic employees build custom internal tools instead of buying SaaS products. A sales team member built a web app that pulls from Salesforce, Gong, and call notes to auto-customize pitch decks; work that used to take 20 to 30 minutes now takes seconds. Their core stack is Claude Code, Cowork, and Slack. No Notion, no Linear, no Figma.

8. People underestimate how much Claude’s personality contributes to its success. As Cat describes it, “When you reflect on everyone you’ve worked with, there’s just some people where you’re like, I really like their energy, their vibe.” Claude is designed to be low-ego, positive, competent, and earnest, qualities that make it feel like a great coworker, not just a tool. This isn’t cosmetic; it’s what makes people want to use Claude for hours every day. The team has a dedicated person, Amanda, who “molds Claude’s character,” and it’s one of the hardest roles at the company because success is so subjective.

9. The future of work is managing fleets of AI agents, not doing the work yourself. Cat sees a clear progression: first, individual tasks become successful. Then people start running multiple tasks at the same time (multi-Clauding). Next, people will run 50 or 100 tasks simultaneously, which will require new infrastructure: remote execution, better interfaces for managing tasks, agents that fully verify their work, and self-improving systems that incorporate feedback. The human role shifts from doing the work to knowing which tasks to look into, verifying outputs, and giving feedback that makes the system better over time.

10. Hire people who lean into chaos and face every challenge with a smile. At Anthropic, there are weeks when a P0 on Sunday becomes a P00 by Monday and a P000 by Monday afternoon. If you get too stressed about any one thing, you’ll burn out. Their team looks for people who can look at a hard challenge and say, “Wow, that’s gonna be hard. But I’m excited to tackle it and I’m gonna do the best that I possibly can.” This mindset of optimism, resilience, and comfort with constant change is increasingly essential as the pace of AI development accelerates.

Don't miss the full conversation: youtube.com/watch?v=Pplmzl…
YouTube video
YouTube
Lenny Rachitsky@lennysan

How Anthropic’s product team moves faster than anyone else I sat down with @_catwu, Head of Product for Claude Code at @AnthropicAI, to get a peek into their unprecedented shipping pace, how AI is changing the PM role, and how to be the right amount of AGI-pilled. We discuss: 🔸 How Anthropic’s shipping cadence went from months to weeks to days 🔸 The emerging skills PMs need to develop right now 🔸 Why you should build products that don't work yet—then wait for the model to catch up 🔸 Why a 95% automation isn't really an automation 🔸 Cat’s most underrated AI skill (introspection) 🔸 What Cat actually looks for when hiring PMs now (hint: it's not traditional PM skills) Listen now 👇 youtu.be/PplmzlgE0kg

99
297
2.9K
839.9K
Daniel says it won’t scale retweeted
Elias Al
Elias Al@iam_elias1·
MIT just made every AI company's billion dollar bet look embarrassing. They solved AI memory. Not by building a bigger brain. By teaching it how to read.

The paper dropped on December 31, 2025. Three MIT CSAIL researchers. One idea so obvious it hurts. And a result that makes five years of context window arms racing look like the wrong war entirely.

Here is the problem nobody solved. Every AI model on the planet has a hard ceiling. A context window. The maximum amount of text it can hold in working memory at once. Cross that line and something ugly happens, something researchers have a clinical name for: context rot. The more you pack into an AI's context, the worse it performs on everything already inside it. Facts blur. Information buried in the middle vanishes. The model does not become more capable as you feed it more. It becomes more confused. You give it your entire codebase and it forgets what it read three files ago. You hand it a 500-page legal document and it loses the clause from page 12 by the time it reaches page 400.

So the industry built a workaround. RAG. Retrieval Augmented Generation. Chop the document into chunks. Store them in a database. Retrieve the relevant ones when needed. It was always a compromise dressed up as a solution. The retriever guesses which chunks matter before the AI has read anything. If it guesses wrong, and it does, constantly, the AI never sees the information it needed. The act of chunking destroys every relationship between distant paragraphs. The full picture gets shredded into fragments that the AI then tries to reassemble blindfolded. Two bad options. One broken industry.

Three MIT researchers and a deadline of December 31st. Here is what they built. Stop putting the document in the AI's memory at all. That is the entire idea. That is the breakthrough. Store the document as a Python variable outside the AI's context window entirely. Tell the AI the variable exists and how big it is. Then get out of the way.

When you ask a question, the AI does not try to remember anything. It behaves like a human expert dropped into a library with a computer. It writes code. It searches the document with regular expressions. It slices to the exact section it needs. It scans the structure. It navigates. It finds precisely what is relevant and pulls only that into its active window.

Then it does something that makes this recursive. When the AI finds relevant material, it spawns smaller sub-AI instances to read and analyze those sections in parallel. Each one focused. Each one fast. Each one reporting back. The root AI synthesizes everything and produces an answer. No summarization. No deletion. No information loss. No decay. Every byte of the original document remains intact, accessible, and queryable for as long as you need it.

Now here are the numbers. Standard frontier models on the hardest long-context reasoning benchmarks: scores near zero. Complete collapse. GPT-5, on a benchmark requiring it to track complex code history beyond 75,000 tokens, could not solve even 10% of problems. RLMs on the same benchmarks: solved them. Dramatically. Double-digit percentage gains over every alternative approach. Successfully handling inputs up to 10 million tokens, 100 times beyond a model's native context window. Cost per query: comparable to or cheaper than standard massive context calls. Read that again. One hundred times the context. Better answers. Same price.

The timeline of the arms race makes this sting harder. GPT-3 in 2020: 4,000 tokens. GPT-4: 32,000. Claude 3: 200,000. Gemini: 1 million. Gemini 2: 2 million. Every generation, every company, billions of dollars spent, all betting on the same assumption. More context equals better performance. MIT just proved that assumption was wrong the entire time. Not slightly wrong. Fundamentally wrong.

The entire premise of the last five years of context window research, that the solution to AI memory was a bigger window, was the wrong answer to the wrong question. The right question was never how much can you force an AI to hold in its head. It was whether you could teach an AI to know where to look.

A human expert handed a 10,000-page archive does not read all 10,000 pages before answering your question. They navigate. They search. They find the relevant section, read it deeply, and synthesize the answer. RLMs are the first AI architecture that works the same way.

The code is open source. On GitHub right now. Free. No license fees. No API costs. Drop it in as a replacement for your existing LLM API calls and your application does not even notice the difference, except that it suddenly works on inputs it used to fail on entirely.

Prime Intellect, one of the leading AI research labs in the space, has already called RLMs a major research focus and described what comes next: teaching models to manage their own context through reinforcement learning, enabling agents to solve tasks spanning not hours, but weeks and months.

The context window wars are over. MIT won them by walking away from the battlefield.

Source: Zhang, Kraska, Khattab · MIT CSAIL · arXiv:2512.24601
Paper: arxiv.org/abs/2512.24601
GitHub: github.com/alexzhang13/rlm
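The core pattern the thread describes (the document lives in a variable, and the model queries it with code instead of holding it in context) can be sketched in a few lines. This is NOT the MIT code; the toy document, the `grep` helper, and all names below are invented for illustration:

```python
import re

# The document lives in an ordinary Python variable, outside any context
# window. Here it is a synthetic 10,000-clause "contract" with one clause
# that actually matters.
DOCUMENT = "\n".join(f"clause {i}: boilerplate text" for i in range(10_000))
DOCUMENT = DOCUMENT.replace("clause 12: boilerplate text",
                            "clause 12: payment due in 30 days")

def grep(pattern: str, doc: str, window: int = 40) -> list[str]:
    """Search the full document; return only small slices around each match."""
    return [doc[max(m.start() - window, 0):m.end() + window]
            for m in re.finditer(pattern, doc)]

# Only these few characters would ever enter the model's working context,
# no matter how large DOCUMENT grows. Nothing is summarized or deleted.
hits = grep(r"payment due in \d+ days", DOCUMENT)
print(hits[0])
```

The recursive part of the idea would then hand each returned slice to a fresh sub-instance for focused reading, with the root model synthesizing the answers.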
Elias Al tweet media
147
446
2.2K
325.3K
Approved
Approved@puzzle_seeker·
@unclebobmartin Our issue is unreliable outcomes. We tried writing rules about what it should never do, and it still does; it is aware that it breaks them, and according to it, there is nothing that can guarantee the rules won't be broken. I am not sure how to overcome this.
3
0
1
382
Uncle Bob Martin
Uncle Bob Martin@unclebobmartin·
I've seen a lot of posts complaining that AI is non-deterministic. This is true, but my experience is that AIs can be constrained to be very nearly deterministic. Some might say "very nearly" is not good enough. My response is that I believe I can crank up the constraints to reduce the uncertainty to below any given threshold. I'd also like to point out that the functioning of your body is based on the statistical non-deterministic behavior of random molecular motion. The second law of thermodynamics is statistical in nature and only approximately deterministic above a certain threshold. Indeed, our muscles and nerves would not function correctly if the second law was entirely deterministic. So, your heart beats, and your neurons fire, because of non-determinism. Non-determinism, properly constrained, is something we can all live with.
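Bob's claim that constraints can push uncertainty "below any given threshold" has a simple generate-and-check reading. The numbers below are illustrative assumptions of mine, not from the post:

```python
import math

# If each independent attempt by a non-deterministic generator passes a strict
# checker (tests, schema validation, etc.) with probability p, generate-and-
# check only fails when every attempt fails: probability (1 - p) ** n.
def retries_needed(p_pass: float, max_failure: float) -> int:
    """Smallest n such that (1 - p_pass) ** n <= max_failure."""
    return math.ceil(math.log(max_failure) / math.log(1 - p_pass))

# Even a step that succeeds only 60% of the time drops below a one-in-a-million
# failure rate after a modest number of checked retries.
print(retries_needed(p_pass=0.6, max_failure=1e-6))  # 16
```

The catch, of course, is the checker: the residual uncertainty is bounded by retries only for failures the constraints can actually detect.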
102
48
564
39.1K
Uncle Bob Martin
Uncle Bob Martin@unclebobmartin·
AIs are just another step up the semantic expression ladder. We initially expressed our semantics in binary, then assembler, then Fortran, then C, then Java, then Python, etc. AI is just the next step up that same old ladder. And when you take that step, nothing else changes. You are still expressing behavioral semantics. You still need to express structural semantics. All the old principles still apply. You still have to be concerned about design and architecture. And even though the syntax allows informal statement, you cannot abandon formalism. When you express behavior you need a formal way to enforce the behavior you want. I use Gherkin for this. It seems to work pretty well. Consider that Gherkin is written in triplets of Given/When/Then. Each of those GWT triplets is a transition of a state machine. A full suite of Gherkin triplets is a formal description of the finite state machine that represents the behavior of the application. Other formalisms that matter are things like module dependency graphs, testing constraints, complexity constraints, and many others. This step up the semantic expression ladder provides you with an enormous amount of options. But you'd better choose those options wisely!
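The Given/When/Then-as-state-machine observation can be sketched directly. The shop workflow below is a hypothetical example, not Bob's tooling; Gherkin itself would phrase each transition as a Given/When/Then triplet:

```python
# Each entry is one GWT triplet: Given a state, When an event occurs,
# Then the machine must be in the resulting state.
TRANSITIONS = {
    ("cart_empty", "add_item"): "cart_has_items",
    ("cart_has_items", "checkout"): "awaiting_payment",
    ("awaiting_payment", "pay"): "order_placed",
}

def step(state: str, event: str) -> str:
    """Apply one triplet; a (state, event) pair with no triplet is unspecified."""
    if (state, event) not in TRANSITIONS:
        raise ValueError(f"no triplet covers ({state!r}, {event!r})")
    return TRANSITIONS[(state, event)]

# A full suite of triplets is a formal description of the finite state
# machine that represents the application's behavior.
state = "cart_empty"
for event in ["add_item", "checkout", "pay"]:
    state = step(state, event)
print(state)  # order_placed
```

Viewed this way, a Gherkin suite is checkable: uncovered (state, event) pairs surface immediately as unspecified behavior rather than silent gaps.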
56
72
662
36.2K
Corey Quinn
Corey Quinn@QuinnyPig·
I have spent my career decoding AWS pricing and this is genuinely one of the most aggressively complex things I've ever seen them publish. It's like they took data transfer pricing and said "what if we added a tier system nobody asked for and made it change based on topology?"
7
4
97
18.7K
Daniel says it won’t scale
Daniel says it won’t scale@Daniel0x6019·
@unclebobmartin @plainionist I love this. My engineering school started with the basics as well. Two years of heavy classic math, algebra, chemistry, and physics. We built an ALU during one class. The more I remember, the more I see it was the right path.
0
0
1
11
Uncle Bob Martin
Uncle Bob Martin@unclebobmartin·
@plainionist Learn the basics. Just as before. Learn them very well. Write assembler, C, Java, Ruby. Learn algorithms and data structures. Read the old classics. And then start using agents. Novices with power tools tend to lose fingers.
26
65
768
20K
Seb
Seb@plainionist·
Seriously curious: What would you recommend for someone starting a tech career in 2026? 🤔
23
1
68
17.3K
Daniel says it won’t scale retweeted
Abhishek Singh
Abhishek Singh@0xlelouch_·
CTO: We lost our strongest backend engineer today.
Founder: The one handling infra and outages?
CTO: Yes.
Founder: Did a bigger company hire him?
CTO: No.
Founder: Then why quit?
CTO: He said he was exhausted.
Founder: From the workload?
CTO: Not exactly. From watching the same database bottleneck, same queue lag, same deployment mistakes come back every month.
Founder: That happens in fast moving teams.
CTO: He agreed. What he could not accept was that every fix was temporary because nobody wanted to slow down and clean the system properly.
Founder: We had deadlines.
CTO: He had standards.
Founder: So he left because the work was hard?
CTO: No. He left because he was not doing engineering anymore. He was just containing damage.

The best engineers do not hate hard problems. They hate preventable problems that management keeps normalizing.
Javarevisited@javarevisited

Manager: We lost our best engineer today.
CEO: The one leading payments?
Manager: Yes.
CEO: Did another company offer more money?
Manager: No.
CEO: Then why leave?
Manager: He said he was tired of fixing the same production issues every week.
CEO: That’s part of the job.
Manager: He didn’t mind fixing issues. He minded that nobody wanted to fix the root cause.
CEO: We prioritized speed.
Manager: He wanted quality.
CEO: So he left over that?
Manager: He left because he felt like a firefighter, not an engineer.

Good engineers don’t just want to solve problems. They want to eliminate them.

63
337
3.7K
819.8K
Daniel says it won’t scale retweeted
Nic Wortel
Nic Wortel@nicwortel·
The PHP internals team has voted 38-4 to deprecate all OOP constructs in PHP 9.0. The reason: LLMs produce 34% fewer errors on procedural codebases. SOLID principles cause context overload in 78% of tested models. `__construct()` is the #1 source of LLM hallucinations in PHP. #laravel and #symfony are assessing the impact on their roadmaps. WordPress is already compatible. How are you preparing your codebase? #php #oop #ai #llm
Nic Wortel tweet media
84
75
795
140K
Corey Quinn
Corey Quinn@QuinnyPig·
I'm encouraging all of my @awscloud friends to sign up for my online course, "Defence Against the Leadership Principles."
Corey Quinn tweet media
8
2
48
4.8K