Mrinal Wadhwa
@mrinal
18.5K posts

CTO @ Autonomy
San Francisco Bay Area · Joined November 2007
1.6K Following · 4.2K Followers

Pinned Tweet
Mrinal Wadhwa @mrinal
I built a swarm of 5000+ deep code review agents that assess a codebase in parallel. Here's a time lapse of them analyzing the source code of @vuejs core:
[GIF]
1 reply · 0 reposts · 7 likes · 449 views
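The fan-out above can be sketched with plain asyncio. Everything here is hypothetical: `review_file` is an invented stand-in for whatever LLM call a real review agent would make, and the file list is made up; this is not the actual swarm implementation.

```python
import asyncio

# Hypothetical stand-in for one deep-review agent; a real agent would
# send the file's contents to an LLM and return structured findings.
async def review_file(path: str) -> dict:
    await asyncio.sleep(0)  # yield control, simulating an I/O-bound model call
    return {"file": path, "findings": []}

async def review_codebase(paths: list[str], max_concurrency: int = 100) -> list[dict]:
    # Thousands of agents can be queued while a semaphore caps how many
    # are actually in flight at once.
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(path: str) -> dict:
        async with sem:
            return await review_file(path)

    # gather() preserves input order, so report i matches path i.
    return await asyncio.gather(*(bounded(p) for p in paths))

reports = asyncio.run(review_codebase([f"src/file_{i}.ts" for i in range(5000)]))
print(len(reports))  # 5000
```

The semaphore is the key design choice: it lets you enqueue one coroutine per file while bounding concurrent model calls to whatever your rate limits allow.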
Mrinal Wadhwa reposted
1Password @1Password
Agent swarms are incredibly powerful and dangerously easy to deploy unsafely with today’s security models.
2 replies · 3 reposts · 8 likes · 3.2K views
Mrinal Wadhwa @mrinal
@mntruell @harjotsgill I think you and all my friends @coderabbitai will find what I was playing around with lately interesting. Obviously not a polished product, but fun. Have a look, would love your thoughts ^
0 replies · 0 reposts · 0 likes · 161 views
Mrinal Wadhwa @mrinal
That was insightful! We're using a similar approach for the developer documentation for our product, and it creates magic.

Autonomy is a platform to run apps that use teams of agents to autonomously perform long and complex tasks. Like many developer products, it has a CLI, a sign-up/sign-in flow driven by the command line, commands to look at logs of running apps, APIs, programming libraries, etc.

Traditionally, docs for such products focus on teaching devs how to develop using the product. We wrote a separate set of docs for coding agents. This fork of the docs is tuned and tested on making coding agents successful at running the full write, test, deploy, test, debug, redeploy loop on their own.

The result is an exceptional experience: devs copy a prompt from our website, paste it into a coding agent, adapt it to whatever agents they want to build, and 20 minutes later they have a first version of a live agentic product, deployed to a public URL, with a UI, streaming APIs, etc.

The secret to the whole experience is a collection of markdown files with an index. Here's that index: autonomy.computer/docs/_for-codi… Here are instructions to try it yourself: autonomy.computer/docs/build-wit…
[image]
0 replies · 0 reposts · 0 likes · 50 views
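As a purely hypothetical illustration of the "markdown files with an index" idea (these file names and sections are invented, not Autonomy's actual docs), an agent-facing index might look something like:

```markdown
# Docs index for coding agents

Read these files, in order, before building or deploying anything.

- setup.md — install the CLI and sign in
- deploy.md — deploy an app and get its public URL
- logs.md — tail the logs of a running app
- debug.md — diagnose failures and redeploy
```

The point is that each entry tells the agent when to read a file, so the agent can drive the whole write-test-deploy-debug loop without a human pointing it at the right page.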
claire vo 🖤 @clairevo
If you love ✅ Claude Code ✅ .md ✅ Python ✅ the zen life of having AI run everything, this ep of How I AI w/ @ttorres is for you.

Teresa, the fab author of Continuous Discovery Habits, shows us how she combines CC + Obsidian + smart automations for her personal productivity stack. She shows:
- her daily task manager built with @claudeai
- her automation for discovering + ingesting interesting scientific research articles
- her super smart "lazy prompting" system

One viewer just commented "SO GOOD! Yall could have gone 2 more hours!"

As always, huge ups to our amazing sponsors:
🧠🤑 @brexHQ, intelligent finance platform built for founders: brex.com/howiai
🦎🐛 Graphite, the next generation of code review: graphitedev.link/howiai

Watch it here: youtu.be/oBho3hZ7MHM
[YouTube video]
12 replies · 18 reposts · 257 likes · 61.7K views
Nikunj Kothari @nikunj
Alright it’s happening.. We’re hosting a “Claude Code for Normies” event this Friday with our friends at @AnthropicAI in SF. Luma link for attendees out soon BUT it’ll be a demo night so I’m looking for non-technical folks who can share what they built. Sign up to demo or DM.
[image]
18 replies · 5 reposts · 139 likes · 42.9K views
Conor Power @conor_power23
Looking to learn about UX design. Any intro book or course recs?
3 replies · 0 reposts · 5 likes · 2.7K views
Aakash Gupta @aakashgupta
I've been testing Claude Cowork since it launched. Claude Cowork's promise to make AI work for everyone seemed worth testing.

Four things that actually worked for me:
1/ Podcast archive mining: drop years of transcripts into a folder, and it finds moments where guests contradicted each other.
2/ Guest prep: pulls a podcast guest's LinkedIn posts, appearances, and controversial takes, and compares them against my existing episodes.
3/ AI trends monitoring: I set it to check X every six hours and log what's spiking. By end of day I know what I missed.
4/ Slides from episodes: I fed it transcripts and it pulled quotes, opened Keynote directly, and built the deck.

Where it breaks:
• Google Docs editing
• Flaky connectors
• Bot detection on some sites

Local files are solid. Browser stuff is hit or miss. I wrote a full breakdown with setup instructions, prompts, and screenshots: news.aakashg.com/p/claude-cowork
[image]
20 replies · 7 reposts · 112 likes · 15K views
Mrinal Wadhwa @mrinal
Gokul is spot on in this post. But the challenge is even bigger.

The latest gen of vertical AI companies are not just competing against one deep-working, long-horizon agent. They are competing against parallel fleets of them.

Autonomy enables their competition to create parent agents that can spawn and delegate work to thousands of sub-agents. Each sub-agent has its own filesystem, a shell to run CLI tools, and the ability to write and run new programs on the fly. They divide complex problems, attack from multiple angles, and converge on outcomes in a fraction of the time.

Agents, in @autonomy_comp, are modeled as concurrent actors that automatically form secure distributed clusters to enable massive scale on a tiny infra footprint. This creates orders-of-magnitude advantages in cost, speed, and scale.

The question to benchmark is: can your specialized agent outperform a coordinated team of 100s or 1000s of really cheap general-purpose agents that can code their way around problems in real time? If not, the time to change your approach is now.
Quoting Gokul Rajaram @gokulr (full post below)
0 replies · 1 repost · 3 likes · 231 views
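The divide-attack-converge pattern described above can be sketched in a few lines. This is not Autonomy's API; `sub_agent` and its scoring are invented stand-ins for agents attacking the same problem from different angles, with the parent converging on the best result.

```python
import asyncio

# Hypothetical stand-in: on the platform described above each sub-agent
# would get its own filesystem and shell; here one just scores an approach.
async def sub_agent(problem: str, strategy: int) -> tuple[int, float]:
    # Pretend strategy 7 happens to fit this problem best.
    score = 1.0 / (1 + abs(strategy - 7))
    return strategy, score

async def parent_agent(problem: str, n_subagents: int) -> int:
    # Fan out: attack the same problem from n angles in parallel,
    # then converge by keeping the highest-scoring approach.
    results = await asyncio.gather(
        *(sub_agent(problem, s) for s in range(n_subagents))
    )
    best_strategy, _best_score = max(results, key=lambda r: r[1])
    return best_strategy

print(asyncio.run(parent_agent("review-contract", 1000)))  # 7
```

Running many cheap attempts and keeping the best one is the whole bet: the parent trades raw per-agent quality for breadth and wall-clock speed.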
Mrinal Wadhwa @mrinal
Gokul, the challenge is even bigger than you so eloquently described. With tools like Autonomy, the latest gen of vertical AI companies are not just competing against one long-horizon agent. They are competing against parallel fleets of them.

A parent agent can now orchestrate thousands of sub-agents, each with its own filesystem, a shell to run command-line tools, and the ability to write and run new programs on the fly. They divide complex problems, attack from multiple angles, and converge on outcomes in a fraction of the time.

The question to benchmark is: can your specialized agent outperform a coordinated team of 100s or 1000s of really cheap general-purpose agents that can code their way around problems in real time? autonomy.computer/docs/what-is-a…
0 replies · 0 reposts · 2 likes · 1.4K views
Gokul Rajaram @gokulr
VERTICAL AI CHALLENGE

Vertical AI Founders: You've spent 2+ years building your agents, training your model on your customers' data, embedding into workflows, creating a powerful GTM motion, all the best practices. You've beaten back challengers and are the #1 or #2 player in your vertical.

I'm sorry, you cannot relax. In fact, you need to massively up your game. Turns out you are facing an existential challenge: long-horizon agents (e.g., Claude Code). Agents that are not trained on a specific domain, but can reliably work for hours or days on end in pursuit of a goal, self-correct, and actually do stuff.

I'm sure many Vertical AI founders will say: "Oh, we are not worried. We are the system of record for decision traces. We train on enterprise-specific context. That's why these horizontal agents can never catch up with this." You might well be right. But, but, but ... you cannot afford to bury your head in the sand.

These long-horizon agents will get better very, very quickly. You need to understand precisely how good they are at the exact jobs you've built your agents on. You cannot wait for someone else to do this. For example, if you're a legal AI company with an agent that automates contract review, you must compare how good your specialized agent is versus a general-purpose long-horizon agent that's simply given the contract and asked to perform the same review.

My challenge to you: Assign a strong engineer on your team to focus 100% on using long-horizon agents (with minimal context, other than just the contract in the example above) to compete with your custom-trained agents. Benchmark how the long-horizon agents perform vs your agent. Rinse and repeat every few months. Like with most other things worth measuring, what matters is the rate of improvement (the "slope" vs the Y-intercept).

If the long-horizon agent is 30% as good as your vertical agent on Day 1, but 50% as good on Day 60, and 70% as good on Day 120, you need to reassess your product strategy.

AGI is coming for everyone. Long-horizon agents are the closest we have to AGI, and as a Vertical AI company, you need to figure out how you compete and survive. Game on.
60 replies · 47 reposts · 543 likes · 98.5K views
Fernando @Franc0Fernand0
@copyconstruct If a team produces code faster than they can understand it, it creates what I’ve been calling “comprehension debt”. Teams that care about quality will take the time to review, understand, and rework LLM-generated code before it makes it into the repo.
3 replies · 4 reposts · 91 likes · 4.9K views
Cindy Sridharan @copyconstruct
Unpopular opinion: Unless you’re just prototyping, you should aim to understand as close to 100% of production code generated by LLMs. Yes, all of it. Effective mental models are still important for humans to sustainably maintain and evolve a codebase via prompting alone.
151 replies · 274 reposts · 3.1K likes · 260.8K views
Mrinal Wadhwa @mrinal
Aakash, usually I agree with your posts, but let me push back on this one.

If AI can replace engineering execution, it can translate customer problems into solutions too. Why would it stop at one but spare the other? Senior PMs have been translating vague pain points into good solutions for years. Senior engineers have been translating vague solution descriptions into secure, reliable architecture for years. AI, at least as it currently stands, needs both types of guidance.

Let me posit a different future:
1. Some teams will have people who are an amalgamation of a Senior PM and a Senior Engineer, people with a mix of deep customer empathy and deep engineering depth. This type of team is what everyone has always wanted, but it is sooo hard to build. It will remain super hard. Which will cause founders to assemble teams of a second type:
2. PMs use AI to co-create prototypes, working closely with customers. They rapidly vet many variations of ideas and then hand over to engineers who can rapidly build reliable and scalable versions from the PM prototypes.

In this second arrangement, the throughput of the entire pipe accelerates, but the PM role remains sort of the same: prioritize what enters the engineering backlog.
0 replies · 0 reposts · 12 likes · 588 views
Aakash Gupta @aakashgupta
The future of product development is a PM with a mass of Claude skills and a small army of agents.

The job has always been translating ambiguous customer needs into structured specifications that someone else executes. That someone else used to be an engineering team. Now it's AI.

Think about what PMs actually do. Sit in a customer call, hear a messy complaint about workflow friction, convert it into something buildable. That translation layer between human problem and technical solution is the entire skillset. PMs have been writing prompts for fifteen years. They just called them PRDs.

The backlog is dead. PMs spent years hoarding features they knew would work but couldn't get prioritized. Quarters of waiting. "We'll get to it in H2." The gap between idea and shipped product was measured in headcount and sprint capacity. Now a PM can wake up with an idea, ship a working prototype before lunch, run user tests by dinner. The roadmap used to be a rationing system. Now it's a launch calendar.

This rewires product development entirely. The old model was PM writes spec, waits for eng, gets 30% of what they asked for, compromises on the rest. The new model is PM builds v1, tests with users, hands off to eng only what needs production hardening. Engineering becomes the scaling function. PMs become the creation function.

Agent orchestration accelerates this further. The job emerging is someone who coordinates fleets of AI systems, translates business context into agent workflows, manages outcomes over tasks. Senior PM job description wearing a different outfit.

Every PM complaint about being "blocked by engineering" was training for a world where you're never blocked again.
Quoting Nikunj Kothari @nikunj:

A controversial take, but I think the software world hasn't priced in the fact that PMs are uniquely suited to thrive in this new world. Especially one where the gap between idea and execution has shrunk SO much.

Good PMs are:
> constantly thinking of new ideas
> spending time articulately building plans (exceptionally important for long-horizon tasks)
> rapid context switching
> good sense of outcomes (vs feedback) and selling price of work
> talking to customers and able to convert into skills (yes, Claude skills)

These folks were always hamstrung by the pace of development and now have been set free. Even the "project management" skills that a lot of PMs end up learning at large companies will be helpful in managing a fleet of agents.

Now, let's be clear: the PMs who are just doing coordination and none of the other things mentioned above were always destined to die a slow death in organizations. But I won't be surprised if a lot of the really good PMs end up starting companies, while it'll be interesting to see what the role eventually evolves to in ~five years within organizations.
17 replies · 10 reposts · 280 likes · 36.6K views
Mrinal Wadhwa @mrinal
@zeeg This also works for the HTTP APIs that Autonomy creates by default for each Agent.
[images]
1 reply · 0 reposts · 0 likes · 18 views
David Cramer @zeeg
My biggest lesson here: a single-user assistant is wildly simpler than multi-user. Memory gets harder, config gets harder, conversation management gets harder. It's not even incremental complexity... it's like 10x harder to make every subsystem behave like you'd expect in multi-user.
Quoting David Cramer @zeeg:

Inspired by @steipete, I've spent a bunch of my spare time over the last few days working on a personal agent. While I'm not sure it's really reusable, or that I'm happy with the current arch, I wanted to share what I've got so far as I think there's some good ideas in here..
1 reply · 0 reposts · 45 likes · 5.8K views
Mrinal Wadhwa @mrinal
@zeeg Exactly, in our implementation of Actors, the runtime automatically provides a mailbox for each actor (agent) where messages get queued.
0 replies · 0 reposts · 1 like · 14 views
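The mailbox-per-actor idea above can be sketched with an `asyncio.Queue` standing in for the runtime-provided mailbox. This is a toy illustration, not Autonomy's actual actor runtime: the `Actor` class and its sentinel-based shutdown are invented for the sketch.

```python
import asyncio

class Actor:
    """Toy actor: the runtime-provided mailbox is modeled as an
    asyncio.Queue, and messages are handled one at a time, in order."""

    def __init__(self) -> None:
        self.mailbox: asyncio.Queue = asyncio.Queue()
        self.log: list[str] = []

    async def run(self) -> None:
        while True:
            msg = await self.mailbox.get()
            if msg is None:  # sentinel: shut the actor down
                return
            self.log.append(msg)  # handle exactly one message at a time

async def main() -> list[str]:
    actor = Actor()
    worker = asyncio.create_task(actor.run())
    # Senders never touch actor state directly; they only enqueue messages.
    for msg in ["hello", "world", None]:
        await actor.mailbox.put(msg)
    await worker
    return actor.log

print(asyncio.run(main()))  # ['hello', 'world']
```

Because all state changes happen inside the actor's own loop, concurrent senders can never race on its state; the queue serializes everything, which is exactly the simplification discussed in the replies above.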
David Cramer @zeeg
@mrinal Yeah, that def makes sense. Simplifies a lot; then you just need a queue for operations (e.g., I only allow two concurrent sessions with active inquiries).
1 reply · 0 reposts · 1 like · 41 views