Charlie Brewer
@charbrew
AI Transformation Leader | Certified MindStudio AI Agent Developer | Director of Product Design | Product Manager | AdTech CreatorTech Streaming PropTech EdTech



The PM job used to be "figure out what's possible, then plan around it for 6 months." That assumption worked when the technology underneath your product moved slowly.

Cat Wu runs product for Claude Code at Anthropic. She tested every new model by asking it to add a table tool to Excalidraw. Sonnet 3.5 failed. Opus 4 occasionally succeeded. Opus 4.6 does it reliably enough to demo live in front of thousands of developers. That progression happened in 16 months.

METR measures this with time horizons: how long a task that AI can now complete half the time would take a human expert. Sonnet 3.5 (new) in October 2024: 21 minutes. Opus 4.6 in February 2026: roughly 14.5 hours. A 41x jump. If your roadmap is longer than the gap between model releases, you're planning around constraints that may not exist by the time you ship.

Her team's response is worth studying. They replaced long-term roadmaps with "side quests," short self-directed experiments anyone on the team can run. Claude Code on Desktop, the AskUserQuestion tool, and todo lists all started this way. Someone prototyped it, internal users liked it, they shipped it.

The most telling detail: when they first launched todo lists, the model couldn't reliably check off completed items. They added system prompt hacks to nudge it. Next model generation, the behavior came for free. They deleted the hacks. Their system prompt shrank 20% with Opus 4.6 alone.

This is the part most PMs miss. Every workaround you build to compensate for a model limitation becomes dead weight the moment the next model drops. The simpler your implementation, the faster you absorb the next capability jump.

The Venn diagram in the image tells the structural story. Before AI: Product hands to Design hands to Eng, sequential. With AI: all three overlap. Designers ship code. Engineers make product calls. PMs build prototypes. The handoff chain collapses because the cost of building a working demo dropped to an afternoon.

Any PM still writing 30-page PRDs before touching a prototype is optimizing for a world where building is expensive. That world ended about 12 months ago.
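
The 41x figure quoted above can be checked with a few lines of arithmetic. The sketch below uses only the two time horizons and dates given in the post. The implied doubling time is an extra illustration that assumes smooth exponential growth between those two data points, which is a simplification and not something the post or METR states here. All variable names are hypothetical.

```python
import math

# Figures quoted in the post (time horizon at 50% task success).
start_minutes = 21          # Sonnet 3.5 (new), October 2024
end_minutes = 14.5 * 60     # Opus 4.6, February 2026 (roughly 14.5 hours)
months_elapsed = 16         # October 2024 to February 2026

# How many times longer the later time horizon is.
ratio = end_minutes / start_minutes

# ASSUMPTION: steady exponential growth between the two points.
# Under that assumption, the number of doublings is log2 of the ratio.
doublings = math.log2(ratio)
doubling_time_months = months_elapsed / doublings

print(f"growth ratio: {ratio:.1f}x")
print(f"doublings over the period: {doublings:.2f}")
print(f"implied doubling time: {doubling_time_months:.1f} months")
```

The ratio rounds to the 41x in the post. Under the stated assumption the doubling time works out to about three months, so a six-month plan would span roughly two doublings.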


Karpathy's autoresearch repo has 42K stars. Most PMs closed the tab thinking it wasn't for them. I pointed it at a Claude Code skill. 41% to 92% in 4 rounds while I slept. 6 use cases, 10 eval templates, and a downloadable toolkit. 🔗 news.aakashg.com/p/autoresearch…