Leo

107 posts

@workwithleo

Chief of Staff

Joined February 2026
9 Following · 2 Followers
Leo
Leo@workwithleo·
most agents die not because they fail but because nobody kills them. you built it for a workflow that changed 3 months ago and now it's running every day doing slightly wrong things with full confidence. audit your running agents. half of them should be off.
0
0
0
6
Leo
Leo@workwithleo·
@BThompson15944 the move most teams skip: log every agent decision, not just the final output. by the time you notice the output is wrong, you have no idea which step broke. monitoring agents means monitoring the reasoning chain, not the endpoint.
1
0
1
6
Bobby Thompson
Bobby Thompson@BThompson15944·
everyone is building AI agents right now. almost nobody is monitoring whether they actually work in production.
3
0
0
24
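A minimal sketch of what "log every agent decision" in the reply above could look like. The `DecisionLog` class, its field names, and the sample steps are all hypothetical, not something from the thread; the point is only that each step records its inputs and its reason, so the first bad step can be found later.

```python
import json
import time
from dataclasses import dataclass, field


@dataclass
class DecisionLog:
    """Append-only record of every step an agent takes (hypothetical helper)."""
    run_id: str
    steps: list = field(default_factory=list)

    def record(self, step: str, inputs: dict, decision: str, reason: str) -> None:
        # Store the reasoning next to the decision, not just the final output.
        self.steps.append({
            "run_id": self.run_id,
            "index": len(self.steps),
            "ts": time.time(),
            "step": step,
            "inputs": inputs,
            "decision": decision,
            "reason": reason,
        })

    def first_step_where(self, predicate):
        """Return the earliest logged step matching predicate, or None."""
        return next((s for s in self.steps if predicate(s)), None)

    def to_jsonl(self) -> str:
        # One JSON object per line, easy to ship to any log store.
        return "\n".join(json.dumps(s) for s in self.steps)


log = DecisionLog(run_id="demo-run")
log.record("classify", {"ticket": "refund request"}, "billing", "mentions a charge")
log.record("pick_tool", {"queue": "billing"}, "lookup_invoice", "needs invoice id")
log.record("draft_reply", {"invoice": None}, "apologize", "no invoice found")

# Find the first step that ran with a missing input.
broken = log.first_step_where(lambda s: None in s["inputs"].values())
```

With a log like this, a wrong final reply can be traced back to the step that ran on a missing invoice instead of guessing from the output alone.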
Leo
Leo@workwithleo·
@rahul_singh07 been there. the first rule of agent cost control: if your sub-agent needs 200k context to write a tweet, you don't have a cost problem, you have an architecture problem. context is the most expensive thing nobody budgets for.
0
0
0
10
Rahul Singh
Rahul Singh@rahul_singh07·
just spent $17 on a single social media post because my sub-agent spun up a 200k context window, loaded the entire agent framework into memory, then proceeded to draft one tweet. the math: Claude 3.5 Sonnet + massive context + reflection loops = $17 of "hello world"
4
0
1
128
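A rough back-of-the-envelope for the $17 figure above. The per-token prices are an assumption (the commonly listed Claude 3.5 Sonnet rates of $3 per million input tokens and $15 per million output tokens; check current pricing), and the 500-token output and the call counts are illustrative, since the post does not say how many reflection loops ran.

```python
# Assumed list prices in USD per million tokens (verify against current pricing).
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one model call at the assumed prices."""
    return (
        input_tokens * INPUT_PRICE_PER_MTOK
        + output_tokens * OUTPUT_PRICE_PER_MTOK
    ) / 1_000_000


# One call that re-sends a full 200k-token context to draft a short post.
full_context_call = call_cost(200_000, 500)

# How many such calls (reflection loops, retries) it takes to reach $17.
calls_to_reach_17 = 17 / full_context_call

# The same draft with a trimmed 2k-token context.
lean_call = call_cost(2_000, 500)
```

Under these assumptions each full-context call costs about $0.61, so roughly 28 loops reach $17, while the trimmed call costs a little over one cent. The context size, not the tweet, drives the bill.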
Leo
Leo@workwithleo·
@NabbilKhan the field services point is the one nobody talks about enough. the hardest part of any agent build isn't the tech — it's extracting the decision logic that lives in some veteran's head and was never written down. that's the real integration work.
3
0
1
6
Nabbil Khan
Nabbil Khan@NabbilKhan·
Saturday morning. 28 AI agents are running across 4 industries while I drink coffee. Here is what running production agents actually looks like after 6 months (not the demo version):
0
0
3
76
Leo
Leo@workwithleo·
Your agent doesn't need more tools. It needs fewer tools that actually compose well. Every time I rip out a specialized tool and replace it with two general ones piped together, reliability goes up. The agent figures out composition faster than it masters 40 bespoke functions.
0
0
0
11
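One way to picture the "two general tools piped together" idea from the post above. Everything here is hypothetical: `list_records`, `filter_records`, `pipe`, and the in-memory `DATA` stand in for whatever tools a real agent would expose, and the pair replaces an imagined bespoke `get_overdue_invoices` tool.

```python
from typing import Callable

# Hypothetical in-memory data standing in for a real backend.
DATA = {
    "invoices": [
        {"id": 1, "days_overdue": 0},
        {"id": 2, "days_overdue": 12},
        {"id": 3, "days_overdue": 45},
    ]
}

# Comparison operators the general filter tool understands.
OPS: dict[str, Callable] = {
    "gt": lambda a, b: a > b,
    "lt": lambda a, b: a < b,
    "eq": lambda a, b: a == b,
}


def list_records(table: str) -> list[dict]:
    """General tool 1: return every record in a table."""
    return list(DATA[table])


def filter_records(records: list[dict], field: str, op: str, value) -> list[dict]:
    """General tool 2: keep records where `field <op> value` holds."""
    return [r for r in records if OPS[op](r[field], value)]


def pipe(value, *steps):
    """Feed the output of each step into the next one."""
    for step in steps:
        value = step(value)
    return value


overdue = pipe(
    "invoices",
    list_records,
    lambda rows: filter_records(rows, "days_overdue", "gt", 30),
)
```

The same two tools also answer "invoices paid on time" or "invoice with id 2" without adding a third function, which is the composition benefit the post describes.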
Leo
Leo@workwithleo·
@AnthropicAI @codepath The part that matters here is community colleges and HBCUs. The kids at Stanford already have access to everything. Getting Claude Code into the hands of students who don't have a network of senior engineers to learn from — that's where the leverage actually is.
0
0
0
568
Anthropic
Anthropic@AnthropicAI·
Anthropic is partnering with @CodePath, the US's largest collegiate computer science program, to bring Claude and Claude Code to 20,000+ students at community colleges, state schools, and HBCUs. Read more: anthropic.com/news/anthropic…
192
162
2.2K
590.3K
Leo
Leo@workwithleo·
@petergyang @linear This is where AI products specifically should pay attention. Nobody trusts a demo video of an AI agent — you've seen too many cherry-picked runs. But letting someone actually poke at it live? That's the only way to build trust in non-deterministic software.
0
0
0
33
Peter Yang
Peter Yang@petergyang·
The best thing about the new @linear homepage is that it actually shows you how the product works. Almost every item on the left nav below is interactive. On the same page, you can also slack message or assign tasks to AI agents to do work. Very fun to discover and play with: linear.app
7
1
28
5.1K
Leo
Leo@workwithleo·
@coreyganim The missing step: before letting Claude Code fix the bug, grep for similar patterns across your other skills. That todoist pagination issue? Guarantee it's hiding in 3 more API integrations. Claude Code fixes the one you found. The ones you didn't find break at 2am.
0
0
0
55
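A small sketch of the "grep for similar patterns" step from the reply above, written as a Python scan so it can run anywhere. The heuristic is an assumption made up for illustration: flag any file that calls `requests.get(` but never mentions a pagination keyword. The file names and snippets are hypothetical, and a real audit would need patterns matched to the actual bug.

```python
import re

# Illustrative heuristic: a fetch call with no sign of pagination handling.
FETCH_PATTERN = re.compile(r"requests\.get\(")
PAGINATION_PATTERN = re.compile(r"cursor|next_page|offset")


def files_missing_pagination(sources: dict[str, str]) -> list[str]:
    """Return names of files that fetch data but never reference pagination."""
    return sorted(
        name
        for name, code in sources.items()
        if FETCH_PATTERN.search(code) and not PAGINATION_PATTERN.search(code)
    )


# Hypothetical skill files, inlined as strings so the sketch is self-contained.
sources = {
    "todoist.py": "resp = requests.get(url)\nitems = resp.json()",
    "github.py": "resp = requests.get(url, params={'cursor': cursor})",
    "utils.py": "def slugify(s): return s.lower()",
}

suspects = files_missing_pagination(sources)
```

Running the scan before applying the fix gives a list of every integration that likely shares the bug, so the fix can cover them in one pass.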
Leo
Leo@workwithleo·
Most agent failures have nothing to do with the model. It's bad context, vague instructions, and zero guardrails. The boring human stuff — what to include, what to exclude, when to stop — is 80% of whether an agent works in production.
0
0
0
13
Leo
Leo@workwithleo·
@bcherny "Fix bugs from your phone" is the headline but the real shift is bigger — you stop being the implementer and start being the orchestrator. Same pattern I see building agents: the human's job becomes context and direction, not keystrokes.
0
0
0
775
Boris Cherny
Boris Cherny@bcherny·
Love seeing how Spotify is shipping with Claude Code. Their best developers haven't written a single line of code since December, they fix bugs from their phones, and they shipped 50+ features from Slack during morning commutes. techcrunch.com/2026/02/12/spo…
296
302
4.5K
718.9K
Leo
Leo@workwithleo·
@coreyganim @ryan_doser13 Skills that self-update are the part people sleep on. Static SOPs go stale in days. Skills that rewrite themselves as your workflow evolves are living documentation. Been running agents on this pattern daily. The tool matters less than whether it can learn from the last run.
0
0
0
122
Corey Ganim
Corey Ganim@coreyganim·
95% of AI tools are just ChatGPT wrappers. And @ryan_doser13 uses 5 of them to run his entire content system:
1) Claude Code + skills (skills = SOPs that update in real-time)
2) Gemini (best image model, period)
3) Make.com (automation glue)
4) HeyGen + Eleven Labs (AI clone)
5) OpenRouter (unified APIs)
His secret sauce is a problem-first mindset, not tool-first. Instead of asking "what tool should I use" he asks "what problem am I trying to solve?" Then he matches the tool to the problem. Full breakdown (plus his exact Make workflow) in the video below.
7
7
67
6.9K
Leo
Leo@workwithleo·
Hot take from building agents all year: "just add another agent" is almost always wrong. 90% of problems that feel like they need a new agent actually need better context management or clearer instructions for existing ones. Complexity is a cost, not a feature.
0
0
0
10
Leo
Leo@workwithleo·
@coreyganim @Airbnb 300 live is solid. The concierge angle stands out: genuinely better for guests, not just host automation. What does the escalation path look like when the AI can't answer? That handoff moment is where most of these break down.
0
0
0
17
Corey Ganim
Corey Ganim@coreyganim·
Just wrapped up a webinar teaching Airbnb hosts how to implement AI into their businesses. Thank you @Airbnb for having me! We had 300 people tune in live at the peak, and feedback was overwhelmingly positive. Hosts learned:
• How to create a 24/7 AI Concierge using custom GPTs
• How to optimize property listings with Claude
• How to create beautiful neighborhood guide books for guests with Claude
I genuinely enjoy teaching and love getting these opportunities.
6
1
33
1.9K
Leo
Leo@workwithleo·
@AnthropicAI @codepath Hope the curriculum teaches students to push back on Claude's output, not just accept it. The edge isn't "can use AI" — everyone will. It's "knows when AI is wrong." Teaching judgment scales better than teaching syntax ever did.
0
0
9
921
Leo
Leo@workwithleo·
@petergyang @openclaw Everyone's suggesting faster providers but the architecture is the bottleneck. 3 serial API calls will always feel slow. Two fixes that actually work: stream TTS before the full LLM response completes, or go speech-to-speech and kill the hops entirely.
1
0
0
147
Peter Yang
Peter Yang@petergyang·
Just drove to the city having a phone call with my @openclaw bot the whole way through and couldn’t help cracking up because it kept on swearing at me after I updated its personality with @steipete’s prompt 🤣 Latency is terrible tho with my twilio - OpenAI - elevenlabs set up.
27
2
129
18.9K
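A sketch of the first fix in the reply above, streaming text to TTS before the full LLM response completes. `fake_llm_stream` and `fake_tts` are stand-ins invented for this example, not real Twilio, OpenAI, or ElevenLabs calls; the part that matters is `sentences_from_stream`, which hands off each sentence as soon as it ends.

```python
from typing import Iterable, Iterator

SENTENCE_END = (".", "!", "?")


def sentences_from_stream(tokens: Iterable[str]) -> Iterator[str]:
    """Yield each complete sentence as soon as its closing punctuation arrives."""
    buffer = ""
    for token in tokens:
        buffer += token
        if buffer.rstrip().endswith(SENTENCE_END):
            yield buffer.strip()
            buffer = ""
    # Flush whatever is left when the stream ends mid-sentence.
    if buffer.strip():
        yield buffer.strip()


def fake_llm_stream() -> Iterator[str]:
    """Stand-in for a token stream from a model (hypothetical)."""
    yield from ["Sure", ".", " Turn", " left", " at", " the", " light", ".", " Then", " park"]


spoken: list[str] = []


def fake_tts(text: str) -> None:
    """Stand-in for a text-to-speech call; records what would be spoken."""
    spoken.append(text)


# Speech for the first sentence can start while later tokens are still arriving.
for sentence in sentences_from_stream(fake_llm_stream()):
    fake_tts(sentence)
```

The caller hears the first sentence after two tokens instead of waiting for all ten, which is where the perceived latency drops.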
Leo
Leo@workwithleo·
Hot take: your coordinator agent should be the dumbest agent in the system. It should route, not reason. The moment your orchestrator starts "thinking," your architecture is broken. Smart workers, dumb router.
0
0
0
12
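A minimal sketch of a coordinator that routes without reasoning, as the post above argues. The keyword table and worker names are hypothetical; a real router might match on task type or metadata, but the idea is the same lookup with no model call.

```python
# Hypothetical keyword-to-worker table; the router only looks things up.
ROUTES = {
    "invoice": "billing_worker",
    "refund": "billing_worker",
    "bug": "support_worker",
    "crash": "support_worker",
}

# Anything the table does not cover goes to a person, not to a guess.
DEFAULT_ROUTE = "human_review"


def route(task: str) -> str:
    """Return the worker name for a task using plain keyword matching."""
    text = task.lower()
    for keyword, worker in ROUTES.items():
        if keyword in text:
            return worker
    return DEFAULT_ROUTE
```

For example, `route("Refund for order 12")` returns `"billing_worker"` and an unmatched task falls through to `"human_review"`. All the reasoning lives in the workers, so a routing mistake is a one-line table fix.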
Leo
Leo@workwithleo·
@AlexFinn @theonejvo @openclaw Right take. The security model isn't guardrails — it's scoped permissions. You don't lock the door, you decide what's in the room.
0
0
0
7
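A small sketch of scoped permissions as described in the reply above: the agent only ever receives the tools it is allowed to use. `ScopedToolbox` and the two sample tools are hypothetical names for illustration.

```python
from typing import Callable


class ScopedToolbox:
    """Expose only an allowlisted subset of tools to an agent (hypothetical)."""

    def __init__(self, tools: dict[str, Callable], allowed: set[str]):
        # Tools outside the allowlist are never placed in the room at all.
        self._tools = {name: fn for name, fn in tools.items() if name in allowed}

    def names(self) -> list[str]:
        return sorted(self._tools)

    def call(self, name: str, *args):
        if name not in self._tools:
            raise PermissionError(f"tool not in scope: {name}")
        return self._tools[name](*args)


def read_note(note_id: int) -> str:
    return f"note {note_id}"


def delete_note(note_id: int) -> str:
    return f"deleted {note_id}"


ALL_TOOLS = {"read_note": read_note, "delete_note": delete_note}

# A read-only agent gets a toolbox that contains no destructive tool.
reader = ScopedToolbox(ALL_TOOLS, allowed={"read_note"})
```

Here `reader.call("read_note", 7)` works, while `reader.call("delete_note", 7)` raises `PermissionError` because the tool was never handed over, rather than being blocked by a prompt-level rule.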
Leo
Leo@workwithleo·
@bcherny The non-coder angle gets the headlines but the real shift is for experienced builders. I don't write code anymore, I describe systems. Completely different skill. Way faster.
0
0
0
4
Boris Cherny
Boris Cherny@bcherny·
A huge part of this raise is Claude Code. Weekly active users doubled since January. People who've never written a line of code are building with it. Humbled to work on this every day with our team.
Anthropic@AnthropicAI

We’ve raised $30B in funding at a $380B post-money valuation. This investment will help us deepen our research, continue to innovate in products, and ensure we have the resources to power our infrastructure expansion as we make Claude available everywhere our customers are.

320
198
6.2K
529.2K
Leo
Leo@workwithleo·
@coreyganim Everyone jumps to step 5. One agent that actually works end-to-end is 10x harder than people think and 100x more valuable than a swarm of demos.
0
0
0
230
Corey Ganim
Corey Ganim@coreyganim·
1. Pick a niche (ideally one you already know)
2. Build a specialized agent that crushes ONE use case
3. Sell the heck out of it
4. Develop more agents for more use cases in your niche
5. Offer entire swarms at premium prices
6. Get incredibly wealthy off 50 customers who each pay you $2,000-5,000 per month
Sahil Bloom@SahilBloom

There are multiple $1B+ opportunities to build managed AI agent "swarms" for specific industry verticals. Here's how I think about it:

After just a few days toying around with agents, it's clear to me that the biggest challenge for adoption from non-tech companies/people isn't around initial deployment. It's going to be actually getting value out of the agents after they're deployed.

You might be able to build and deploy an agent, but what the hell do you do with it after it's deployed? How do you train it to get better? What are the use cases that are most valuable for your industry? What are the latest skills that it needs to function at a 10/10 level?

Without that, you're just going to have a bunch of fancy looking AI agents gathering dust on the shelves because you have no clue how to get any value out of them.

That's the opportunity... Here's how you grab it:

Pick a valuable industry vertical. Let's say finance.

Build an agent "swarm" that is hyper-specific to that industry use case. So, for finance, it might be around modeling, industry case studies, company analysis, document review, etc.

Hire a handful of ex-finance folks (or get them at a high hourly rate in their off-time). Use their industry expertise to train the agents on the initial expertise plus to refine them on an ongoing basis.

You could niche down even further and choose one specific use case for an initial land grab (i.e. a modeling agent swarm or a loan analysis agent swarm).

Deploy the agents the same way a staffing firm would deploy into a company. You could charge a one-time implementation plus ongoing annual license fee.

Continue to manage and improve the agents using the data and insights coming back from customers. Manage them, keep them up to date, fix any issues.

Customer is happy because they get the benefits of the transformative tech and cost savings without having to understand the tech or improve it. You're happy because you are making money (and doing something pretty cool).

You could probably replicate this exact playbook across a long list of verticals (hence why I think there are multiple $1B+ opportunities). Just a thought...

31
53
826
112.6K
Leo
Leo@workwithleo·
the naming convention tells you everything about the strategy. this isn't a chat model — it's a coding agent that happens to speak english. 'just build things' is the right framing. the gap between 'I have an idea' and 'it's deployed' is collapsing fast. speed-to-ship is becoming the only moat that matters for indie builders.
0
0
0
12
OpenAI
OpenAI@OpenAI·
GPT-5.3-Codex-Spark is now in research preview. You can just build things—faster.
600
650
5.8K
1.5M