Chain of Thought Podcast

32 posts

Chain of Thought Podcast
@chain_ofthought

AI in production: real talk with engineers, founders & researchers building breakthrough systems. Hosted by @ConorBronsdon. New podcast episodes weekly.

Joined November 2025
3 Following · 16 Followers
Pinned Tweet
Chain of Thought Podcast @chain_ofthought
Thanks to all of our listeners for an incredible first full year 🥳
Chain of Thought Podcast retweeted
Conor Bronsdon @ConorBronsdon
Dan Klein spent 5 years at Microsoft watching AI prototypes die on the vine. His diagnosis: we built plausibility engines, not truth engines. But to get to production, we need reliability - that's our expectation. And the control surface (prompting) has no precise semantics to boot. Getting to 99% reliability is a huge challenge for AI systems.

So Dan co-founded @ScaledCognition and built their highly reliable APT1 model for under $11M - after winning the ACM Grace Murray Hopper Award and having two previous startups acquired by AOL & Microsoft.

Watch my newest @chain_ofthought ep with Dan Klein:
0:00 Cold open: RL is about doubling down on what works
0:28 Introducing Dan Klein and Scaled Cognition
2:53 The demo-to-production gap: why AI prototypes die
5:40 Why prompting is not a real control surface
8:06 Modular decomposition vs. end-to-end optimization
10:55 Are LLMs fundamentally mismatched with how we use them?
14:26 What's wrong with benchmarks today
20:27 APT1: building a model for actions, not tokens
24:14 What makes data truly agentic
28:02 Hallucinations as an iceberg
34:16 Building a prototype model for under $11 million
39:57 Applying RL to conversations without a zero-sum winner
43:31 LLMs as a condensation of the web
50:07 Reasoning models: where they work and where they don't
53:04 Early deployments in regulated industries
57:14 Why multi-model checking fails
1:00:34 The minimum bar for trustworthy agentic systems
1:04:07 Societal risk: when AI output is indistinguishable from truth
1:13:33 Where Dan is inspired in AI research today
Chain of Thought Podcast retweeted
Conor Bronsdon @ConorBronsdon
Dan Klein, co-founder & CTO of @ScaledCognition and ACM Grace Murray Hopper Award winner, joined me on @chain_ofthought to break down why LLMs are fundamentally plausibility engines and how his team built their APT1 model for under $11 million 👇 youtu.be/2HwtyPE6JuQ?si…
Chain of Thought Podcast retweeted
Conor Bronsdon @ConorBronsdon
Many agent failures get blamed on the model while the real problem is almost always context or memory architecture that was never well-designed in the first place.

Memory engineering is becoming its own discipline: how you store, index, retrieve, and crucially, how your agents forget information. Agent builders are just now catching up.

@richmondalake, Director of AI Developer Experience at @Oracle, laid out the full taxonomy on my @chain_ofthought Podcast: just look at how much architecture sits upstream of the context window.

The Memory Core handles five data types (vectors, graph, relational, spatial, JSON). The Memory Manager orchestrates retrieval, indexing, storage, and decay logic. Only after all of that does information reach the context window, where token budgeting and composition determine what the model actually sees.

Richmond maps this directly to neuroscience. Humans don't have one memory system. We have working memory, episodic memory, semantic memory, and procedural memory, each with its own retrieval and decay rules. Agent memory should work the same way.

Perhaps his best line from our conversation: "Don't delete, forget." Information should decay through relevance scoring and importance weighting, not hard deletes. In regulated industries, that distinction is the difference between compliance and a lawsuit.

78% of enterprises have AI agent pilots. Only ~14% have scaled one to production. Too often, we've been optimizing models when we should have been engineering memory.

Read about the AI agent amnesia problem: newsletter.chainofthought.show/p/your-ai-agen…

And listen to the full episode: chainofthought.transistor.fm/episodes/agent…
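The "don't delete, forget" idea can be sketched in a few lines: memories decay in retrieval relevance over time but are never hard-deleted, so every record stays auditable. This is an illustrative sketch only (the `MemoryStore` class and its methods are hypothetical names, not Oracle's actual implementation):

```python
import time

class MemoryStore:
    """Toy 'forget, don't delete' memory: items lose retrieval
    relevance via exponential time decay, but records are kept."""

    def __init__(self, half_life_s=3600.0):
        self.half_life_s = half_life_s
        self.items = []

    def remember(self, text, importance=1.0):
        self.items.append({"text": text,
                           "importance": importance,
                           "created_at": time.time()})

    def _score(self, item, now):
        age = now - item["created_at"]
        decay = 0.5 ** (age / self.half_life_s)  # halves every half_life_s
        return item["importance"] * decay

    def recall(self, top_k=3, min_score=0.0):
        now = time.time()
        ranked = sorted(self.items,
                        key=lambda it: self._score(it, now),
                        reverse=True)
        # Low-scoring memories simply stop surfacing -- "forgotten"
        # for retrieval, but still present for audit/compliance.
        return [it["text"] for it in ranked[:top_k]
                if self._score(it, now) >= min_score]
```

A real system would combine this time decay with relevance and importance signals per memory type, but the key property is the same: forgetting is a ranking decision, not a delete.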
Chain of Thought Podcast retweeted
Conor Bronsdon @ConorBronsdon
Memory is the last level playing field in the AI stack. A single developer can build a memory system that outperforms ChatGPT's.

@richmondalake (@Oracle's AI DevEx Director) is perhaps the foremost expert on agent memory in the world, and he joined me on @chain_ofthought to break down his approach, talk about his course with @AndrewYNg and give us a live demo.

Chapters:
Memory is the last battleground in AI
Meet Richmond Alake, Oracle's AI DevEx lead
Why memory engineering is its own discipline
The failure modes nobody talks about
Demo: a memory-aware financial services agent
Segmenting context windows by memory type
Four human memory types mapped to agent architecture
Procedural memory in production systems
Don't delete, forget: implementing controlled decay
Where context engineering ends and memory engineering begins
Is agent memory fundamentally a database problem?
Files vs. databases: what production actually needs
Picking your lane in the AI noise
Richmond's courses and where to follow

Read more 👇🧵
Chain of Thought Podcast retweeted
Conor Bronsdon @ConorBronsdon
Agent memory is an underappreciated aspect of AI agents - and understanding how to approach both remembering and actively forgetting information in your agentic systems is crucial to their long-term viability. @richmondalake explained on @chain_ofthought 👇🧵
Chain of Thought Podcast retweeted
Michel Tricot @MichelTricot
Went on @chain_ofthought with @ConorBronsdon and demoed context poisoning live. One Gong query: 30K extra tokens through raw APIs vs. a fraction through a context store. Agents don't fail because models are bad. They fail because the data feeding them is wrong. Full video in reply
Chain of Thought Podcast retweeted
Conor Bronsdon @ConorBronsdon
.@AirbyteHQ is betting that agent infrastructure is the future of their business. They built 600+ free data connectors, hit a $1.5B valuation, and have now refocused on agents, launching their 'agent engine'.

Their CEO Michel Tricot @micheltricot came on the @chain_ofthought podcast to explain why. His take? Context poisoning and fragile agent infra are killing agents and their usefulness. One Gong query through raw API calls can burn 30,000 extra tokens and take 3x as long as the same query through their context store. Scale that up to a production system - or simply run too many of these as an individual - and you've got cost, time, and rate limit problems.

Michel and I talked for 45 minutes about how to make our agents more productive 👇

Chapters:
0:00 Intro
0:20 Meet Michel Tricot, CEO of Airbyte
2:27 Data Got Us to the Information Age. Context Gets Us to Intelligence.
4:48 How Context Poisoning Breaks Agents
7:49 Why Airbyte Customers Stopped Loading Into Warehouses
10:12 Live Demo: Context Store vs Raw API Calls
10:38 What Does a Context Engineer Actually Do?
14:14 RAG Isn't Dead, But How We Build It Will Die
16:41 30K Wasted Tokens Without Proper Context
22:22 Cross-System Joins: Zendesk, Gong, and Salesforce
26:12 The Open Source Agent Connector SDK
29:45 The SaaS Apocalypse Is Overblown
36:09 From Data Pipes to Agent Infrastructure
38:51 What Agents Need to Get Right by Summer
40:48 Memory Is Just Another Form of Context
43:07 Outro
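The token-waste mechanism described above can be illustrated with a toy comparison (hypothetical data shapes and function names; this is not Airbyte's actual engine): dumping raw API pages into the prompt drags every metadata field along, while a context store normalizes records down to just the fields the agent needs.

```python
import json

def raw_api_context(pages):
    """Naive path: serialize full raw API responses into the prompt.
    Pagination envelopes and metadata fields all ride along."""
    return "\n".join(json.dumps(p) for p in pages)

def context_store(pages, fields=("title", "summary")):
    """Context-store path (sketch): keep only the fields the agent
    actually needs, one compact record per item."""
    records = []
    for page in pages:
        for item in page.get("results", []):
            records.append({f: item.get(f, "") for f in fields})
    return "\n".join(json.dumps(r) for r in records)
```

On real payloads (nested metadata, audit fields, full transcripts), the raw dump is routinely an order of magnitude larger, which is the extra-tokens effect the demo shows.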
Chain of Thought Podcast retweeted
Conor Bronsdon @ConorBronsdon
.@yujian_tang started the r/AI_Agents subreddit in April 2023. For the first year, it barely moved. Then it hit 9,000 members, he went on vacation, came back to 36,000, and now it's over 300,000.

In this @chain_ofthought episode, Yujian talks about how that community grew alongside Seattle Startup Summit (900+ attendees), two failed startups, and why he just filed paperwork to launch his own venture fund. We dig into the mechanics of starting a fund from scratch, why AI startup valuations have doubled in two years, whether a one-person unicorn is realistic, and what failed founders learn that successful ones sometimes miss.

Chapters:
0:00 Cold Open: The Subreddit Growth Explosion
0:21 Intro and Meet Yujian Tang
1:06 From AI Research to Community Building
7:26 Where AI Applications Are Headed
10:03 The AI Bubble and a Valuation Reset
10:39 Getting Deal Flow Through Community Events
14:02 Filing the Fund: The Boring Side of VC
16:04 How r/AI_Agents Went from Crickets to 300K
18:39 Building an Accidental Empire
26:37 What Two Failed Startups Taught Him
29:52 Why Pre-Seed Valuations Are Out of Control
37:37 The One-Person Unicorn Debate
39:50 Seattle Startup Summit 2026
42:17 What Chain of Thought Should Cover Next
43:25 Outro
Chain of Thought Podcast @chain_ofthought
Meet MARVIN, the AI assistant designed for effective onboarding - so you don't keep having context problems 😅
Conor Bronsdon @ConorBronsdon

Most people blame the model when their AI assistant underperforms. @SilverJaw82 blamed the onboarding.

So Sterling built his personal AI assistant MARVIN on Claude Code and actually treated it like a junior employee: he wrote rules, onboarded it, corrected mistakes, and built context over time. 40 days later, MARVIN was handling 90% of his workday as his AI chief of staff.

He introduced me to MARVIN live on @chain_ofthought, taking us through a demo of his workflow: from meeting transcripts to Jira tickets to blog drafts, with context carrying across every handoff.

Chapters:
0:00 Intro
0:28 Meet Sterling Chin and the MARVIN AI Assistant
9:10 Live Demo: How MARVIN Bookends Your Workday
16:04 Personality, Sub-Agents, and Writing Rules
22:00 Automating Meeting Notes to Jira Tickets
29:30 Why DIY AI Assistants Outperform Big Tech
40:55 Treat Your AI Like a Junior Employee
46:41 How to Get Started with MARVIN
55:36 The Compute Crunch and Open Source Future

Chain of Thought Podcast retweeted
Conor Bronsdon @ConorBronsdon
Block tripled its headcount from 3,835 to 12,985 in 3 years. Then Jack Dorsey cut 4,000 people and said AI made them unnecessary. The stock jumped 22%.

A few weeks before the cut, I interviewed Block's VP of AI Tools for my @chain_ofthought Podcast. She showed me live demos of the tools Jack is now citing as his reason.

I spent the last 10 days collecting every perspective I could find:
- Jack's own defense (5M+ views on his response)
- Block's CFO laying out the efficiency math
- Cash App's head of design sharing production metrics
- Sam Altman calling it "AI washing"
- A Wharton professor saying the efficiency claims don't add up
- A data scientist who survived, got offered a 90% raise, and quit

The tools are impressive. The bloat was there. The stock incentive is locked in. And 4,000 people are job hunting from a company that was growing and profitable. Anyone telling you this is simple is selling you something.

I wrote a full, 6,600-word definitive analysis in my newsletter 👇
Chain of Thought Podcast retweeted
Conor Bronsdon @ConorBronsdon
Intercom just raised $250M for @Fin_ai at a $2B+ valuation 🤑

Two weeks ago, @intercom's Chief AI Officer @fergal_reid came on @chain_ofthought and broke down exactly how they got here - including ditching GPT-4 for fine-tuned Qwen models for some tasks to save $250K/month, and a ton of great info on how they think about building and growing Fin.

Chapters:
0:00 Intro
0:46 Why Intercom Completely Reversed Their Fine-Tuning Position
8:00 The $250K/Month Summarization Task (Query Canonicalization)
11:25 Training Infrastructure: H200s, LoRA to Full SFT, and GRPO
14:09 Why Qwen Models Specifically Work for Production
18:03 Goodhart's Law: When Benchmarks Lie
19:47 A/B Testing AI in Production: Soft vs. Hard Resolutions
25:09 The Latency Paradox: Why Slower Responses Get More Resolutions
26:33 Why Per-Customer Prompt Branching Is Technical Debt
28:51 Sponsor: Galileo
29:36 Hiring Scientists, Not Just Engineers
32:15 Context Engineering: Intercom's Full RAG Pipeline
35:35 Customer Agent, Voice, and What's Next for Fin
39:30 Vertical Integration: Can App Companies Outrun the Labs?
47:45 When Engineers Laughed at Claude Code
52:23 Closing Thoughts

Fergal shared an excellent playbook for when to fine-tune vs. use frontier models, how to scale an AI team from 10 to 100+, and what it actually takes to defend an application layer company.

Watch here - or listen at the links below 👇