Manish Sharma

33.5K posts

@msharmas

CTPO CoFounder Caladrius https://t.co/9tGLzTelYO | Founder, Director, Yajur Healthcare | https://t.co/fiFNOhnNUW | https://t.co/JaLIGNFGvX

Bangalore, Karnataka, India · Joined May 2009
6.5K Following · 6.1K Followers
Pinned Tweet
Manish Sharma@msharmas·
Be Patient. Progress Takes Time.
0 replies · 0 reposts · 8 likes · 905 views
Manish Sharma retweeted
Andy Fang@andyfang·
Today, we're launching a new suite of AI-powered tools to help our merchants grow their sales. Some highlights:
1) Onboarding that reads your website with AI and sets you up on DoorDash automatically, 35%+ faster
2) AI photo editing to make every menu photo look professional
3) AI-built websites to power your online ordering in minutes
4) Marketing campaigns that read, write, and send themselves, AI-optimized to be the most relevant to customers
17 replies · 17 reposts · 381 likes · 146.6K views
Manish Sharma retweeted
Arch Valmiki@archvalmiki·
@adamghowiba Full talk. Though this is a year old, pre the step-upgrade in models that happened in Dec/Jan, so some items might not apply: youtu.be/yMalr0jiOAc
YouTube video
6 replies · 14 reposts · 179 likes · 60.7K views
Manish Sharma retweeted
Adam Ghowiba@adamghowiba·
JP Morgan's investment research team just shared exactly how they built their multi-agent system "Ask David", and it's the same architecture pattern showing up everywhere:
- supervisor agent orchestrates
- specialized subagents handle retrieval, structured data, analytics
- LLM-as-judge reflection node before the answer ships
- human-in-the-loop for the last accuracy gap
Worth watching for anyone building.
133 replies · 620 reposts · 6.4K likes · 1.8M views
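The supervisor/subagent/judge pattern described in the post reduces to a small control loop. A minimal sketch only: every function below is a hypothetical stand-in for an LLM or retrieval call, not JP Morgan's actual implementation.

```python
# Hedged sketch of the supervisor / subagent / LLM-as-judge pattern.
# All functions are invented stand-ins for model calls.

def retrieval_agent(question):
    return f"docs for {question!r}"          # subagent: document retrieval

def analytics_agent(question):
    return f"metrics for {question!r}"       # subagent: structured data / analytics

def judge(answer):
    # LLM-as-judge reflection node: approve, or send back for revision
    return "ok" if answer else "revise"

def supervisor(question):
    # The supervisor routes the question to specialized subagents,
    # synthesizes their outputs, and runs the judge before anything ships.
    context = [retrieval_agent(question), analytics_agent(question)]
    answer = " | ".join(context)             # stand-in for a synthesis LLM call
    if judge(answer) != "ok":
        answer = supervisor(question)        # reflect and retry
    return answer                            # a human reviews before it ships
```

The human-in-the-loop step sits outside the loop by design: the judge catches self-detectable errors, the reviewer catches the rest.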
Manish Sharma retweeted
Karl Mehta@karlmehta·
Anthropic's MCP team explains the same layer shift: models do not interact with APIs directly. They interact with prompts, tools, and context. That is why the harness around the model becomes so important.
Karl Mehta@karlmehta

x.com/i/article/2050…

9 replies · 24 reposts · 176 likes · 56.8K views
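The layer shift Karl describes can be sketched concretely: the model never calls the API; it emits a structured tool call against a schema, and the harness executes it and feeds the result back as context. The `get_weather` tool and `weather_api` function below are invented for illustration.

```python
# Hedged sketch of a "harness": the model sees prompts, tools, and context;
# the harness, not the model, touches the underlying API.
import json

def weather_api(city):
    # The raw API the model never calls directly (invented example).
    return {"city": city, "temp_c": 21}

TOOLS = {
    "get_weather": {
        "description": "Look up current weather for a city",
        "handler": weather_api,
    }
}

def harness(model_tool_call):
    # The model emits a structured tool call as JSON; the harness
    # dispatches it and returns the result as context for the next turn.
    call = json.loads(model_tool_call)
    result = TOOLS[call["name"]]["handler"](**call["args"])
    return json.dumps(result)

print(harness('{"name": "get_weather", "args": {"city": "Pune"}}'))
```

Everything the model can and cannot do is decided in that dispatch table, which is why the harness, not the raw API surface, becomes the important design artifact.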
Manish Sharma retweeted
sesh@seshadrinithin·
Banger after banger from @JayaGup10. I think this raises a unique opportunity: separate context, memory, and harness from the model infrastructure and offer it to companies, i.e., services as a product offering with the infrastructure for systems to build systems. Easier said than done, but the opportunity exists nevertheless.
Jaya Gupta@JayaGup10

x.com/i/article/2042…

7 replies · 8 reposts · 166 likes · 65K views
Manish Sharma retweeted
Aakash Gupta@aakashgupta·
Six months ago, the longest task you could trust an AI agent to finish without supervision was three minutes. Today it's hours. Soon it's a workday.

Models didn't get that much smarter. They finally got the scaffolding every functioning human has: memory of the world, tools to act on it, and guardrails that keep them from doing dumb things.

Mahesh, who's been building AI products at Microsoft, Amazon, Meta, and Google for 13 years, explains it through how a kid learns. First you teach them hot and cold. That's signals. Then they can fetch a glass of water. That's tools. Then they learn what's allowed. That's guardrails. The intelligence was always there. The scaffolding is what made the kid useful.

Claude Code is the moment the scaffolding got production-ready. File system access, bash, long-horizon jobs, skills written in plain English. The whole stack a working adult uses, ported to an agent.

Most PMs are still writing prompts like they're talking to a smart toddler. The PMs pulling ahead are raising agents that remember what happened yesterday and follow the house rules today. Stop prompting. Start raising.
Aakash Gupta@aakashgupta

This guy literally broke down how to become a $1.4M "builder PM" with n8n, Claude Code, and OpenClaw:
1:53 - What a "builder PM" actually is
6:04 - Your first agent in n8n (live build)
14:18 - Why every agent needs these 4 things
21:35 - The multi-agent eval loop
29:47 - Where n8n dies
33:39 - When to graduate to Claude Code
35:08 - What broke in December 2025
47:17 - The self-improving PRD reviewer
1:02:28 - Mocks and prototypes without designers
1:05:15 - OpenClaw and the new agent OS
1:22:06 - What AI PM interviews look like now

4 replies · 8 reposts · 39 likes · 8.6K views
Manish Sharma retweeted
Aakash Gupta@aakashgupta·
The PM role got rewritten in production while half the industry was still debating whether PMs should code.

Marcus Moretti runs Spiral at Every as a one-person team: PM, code, customer support, marketing. His read on the role is in this guide, and it rewards reading slowly.

He doesn't write tickets anymore. His agent writes them, moves them around the board, keeps statuses live. PRDs, sprint planning, backlog grooming, stakeholder updates are all gone. What replaced 60% of a PM's old week is two files and a cron job.

The whole workflow:
- strategy.md defines the target problem, approach, who it's for, 3-5 SMART metrics, and 2-4 work tracks. Rerun the interview every few months.
- /ce:product-pulse runs at 8am daily, reads PostHog, Stripe, Datadog, and the database, and writes a 30-40 line briefing to ~/pulse-reports/.
- Now/Next/Later kanban. In Progress, Done. No sprints, no standups.

The constraint most teams aren't repricing: vendors without MCPs become unusable in this workflow. Marcus's words on a tool his predecessor was paying for: "didn't have an MCP, and it was swiftly cancelled." The average company runs 100 SaaS subscriptions. A meaningful slice of those just got a 12-month death timer.

The PM job description follows the same logic. Whatever the agent can't read, the PM can't use. Whatever the PM can't use becomes someone else's job. Read it twice.
Dan Shipper 📧@danshipper

must read Marcus went from product manager to shipping product like a madman @every with coding agents he wrote the definitive guide for how to do it: every.to/guides/ai-prod…

6 replies · 31 reposts · 268 likes · 105.8K views
Manish Sharma retweeted
Aakash Gupta@aakashgupta·
For 30 years, PMs wrote Jira tickets and engineers wrote code. A 21-agent dev team inside Claude Code just inverted that workflow.

The spec is now the bottleneck. Clarity about what you actually want determines everything downstream. This is why "classic PM skills > vibe coding" lands harder than it sounds. Vibe coding without a real spec produces an MVP that crashes the second a user touches it. A 21-agent team multiplied by a sloppy prompt produces 21 agents' worth of slop.

The system analyst agent is the tell. It sits at the front of the workflow doing pure thinking work. The entire stack falls over without it.

A working hockey rules app hit TestFlight inside 2 hours and 13 minutes of timestamps. Idea, Figma, Jira, shipped code. One person. The leverage moved to whoever writes the clearest specs. That's the whole job now.
Aakash Gupta@aakashgupta

This guy literally broke down everything to build a 21 agent dev team in Claude Code:
0:00 - Why one prompt = AI slop
1:55 - Claude Code as your full startup
2:43 - The 21-agent team
4:57 - Inside the system analyst agent
5:52 - Live demo: 0 → TestFlight
8:42 - Define a "good" system analyst first
11:53 - The system analyst workflow
12:17 - Why Confluence + Jira + MCPs matter
15:30 - Classic PM skills > vibe coding
19:15 - The scaffolding that kills spaghetti code
22:10 - Setting up agents inside Claude Code
26:19 - The longest dictation prompt this podcast has seen
35:29 - Why dictation beats typing for AI specs
47:30 - Brand guidelines in Figma Make
55:59 - Idea → prompt → design → app
1:06:27 - Claude Code builds the Figma screens
1:23:59 - Frontend tickets appear in Jira (with Figma links)
1:48:49 - The hockey rules AI app goes live
1:53:56 - Full recap: Claude, Confluence, Figma, Jira, Simulator, TestFlight
2:03:17 - Should PMs get AI PM certificates?
2:08:15 - How to build a PM portfolio that lands FAANG offers
2:13:25 - How to get started this week

5 replies · 8 reposts · 50 likes · 9K views
Manish Sharma retweeted
Aaron Levie@levie·
As agents become the biggest users of software, all software has to be available in a headless fashion. Agents won't be using your UI, they'll be talking to your APIs. So the question becomes: what is the business model of software and this headless approach in the future? Here are a few thoughts on how everything plays out based on what we're seeing and doing at Box, but also conversations with other platforms.

1) Seats don't go away for *people*. Seats are still a convenient and efficient way to have a customer use technology predictably for a set of users within a baseline set of usage. The key, though, is that when the customer pays for a seat, it has to come with a set of API usage that the agent can use on that user's behalf. The user will need to be able to interact with their data and the underlying tool via any agent they work with, and an embedded amount of usage will come with the seat. I would imagine most software -Box included- will enable seats to work with their data at a relatively high volume via systems like ChatGPT, Codex, Claude, Gemini, Cursor, Copilot, Perplexity, Factory, Cognition, et al. quite seamlessly. If you don't do this, you're DOA.

2) Agents may have "seats" if they are doing stateful work in the system, but they will be priced very differently than people. Seats (or the equivalent) can make sense when you have an agent that has its own workspace, stores its own data, needs a different set of permissions compared to the user, and so on. If a company wants this agent to be around for a long period of time, that may very well look like another "user" in the system. OpenClaw-style agents highlight what this future could look like. The only issue on pricing here is that one customer could decide to do all their work in 1 agent, and another might split it into 1,000 agents. So pricing like a human seat is nearly impossible and impractical; each company will have a different approach for this, as it gets tricky trying to perfectly capture all the value within an agent seat.

3) The dominant pricing for headless use that goes above the seat allotment, or when an agent is firmly acting on its own, will be a consumption model. Many enterprise software platforms have previously operated like this with PaaS options, and agents will look like another machine user of their system. In some cases the APIs might get priced just as they did previously, but in other cases there may need to be new types of APIs that represent the work an agent would do in one go -more akin to an outcome- instead of a series of API calls. This is especially germane when the headless software also has an agentic use-case embedded within it, such as orchestrating the process within its own system via AI. Overall the growth of this usage pattern is effectively unbounded, as the use-cases for agents operating on data in these systems will dramatically exceed what people do with their data and tools today.

Every platform that goes headless (which will be anyone that wants to take advantage of agents) will need to adopt a model like this. Some may fight it initially, but it's an inevitability, as there will always be more agents outside your platform than people. Overall, there's a lot of really interesting changes left to come in software due to headless use of these systems. Early days.
112 replies · 91 reposts · 936 likes · 141.5K views
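The hybrid model Levie sketches, a per-seat fee that bundles an API-call allotment with consumption pricing above it, reduces to simple arithmetic. All prices and allotments below are invented for illustration; real platforms will pick very different numbers.

```python
# Hedged sketch of seat-plus-consumption pricing for headless/agent use.
# All dollar figures and allotments are invented for illustration.

def monthly_bill(seats, api_calls, seat_price=25.0,
                 included_calls_per_seat=10_000, overage_per_call=0.002):
    # Each human seat bundles an embedded allotment of API usage that
    # agents may spend on the user's behalf; usage above the allotment
    # (or by standalone agents) is billed on consumption.
    included = seats * included_calls_per_seat
    overage = max(0, api_calls - included)
    return seats * seat_price + overage * overage_per_call

# 10 seats, 150K agent-driven calls: 100K included, 50K billed as overage.
print(monthly_bill(10, 150_000))  # → 350.0  (250 seat fees + 100 overage)
```

Note how this sidesteps the 1-agent-vs-1,000-agents problem Levie raises: the unit of charge above the allotment is the call (or outcome), not the agent.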
Manish Sharma retweeted
Google DeepMind@GoogleDeepMind·
AI co-clinician is our new research initiative to help explore how multimodal agents could better support healthcare workers and patients. 🩺 Here’s a snapshot of our progress 🧵
77 replies · 221 reposts · 1.2K likes · 335.1K views
Manish Sharma retweeted
Aakash Gupta@aakashgupta·
Anthropic just shipped a security product that does what every engineer has wanted for 20 years: stop sending you 864,603 false alarms a year.

The OX Security 2026 benchmark ran the math across 216 million findings at 250 organizations. The average enterprise pulls 865,398 security alerts annually. 795 are real. 91% of SAST findings are false positives. After 15 of their first 20 alerts come back as garbage, engineers stop investigating. A 10-developer team burns 24 hours a week on noise. Real CVEs sit underneath the alarms for months.

This is the wound the entire AppSec industry has been bleeding from for two decades, and the valuations show it. Snyk peaked at $8.5B in 2021. BlackRock has it marked at $3.7B. A PE firm offered under $3B this year and got rejected. Synopsys sold its Software Integrity Group for $2.1B last October. Veracode went for $2.5B in 2022. Checkmarx is shopping itself at $2.5B with no taker. Snyk's growth decelerated to 12% last quarter while Palo Alto printed 16% and CrowdStrike printed 21%.

Every legacy vendor has been trying to fix this with bolted-on AI for three years. Snyk DeepCode. GitLab Advanced SAST. Datadog Bits AI. IBM watsonx. Each wraps an LLM around the same rules engine that produced the noise.

Now read the Anthropic line: "validates each finding to cut false positives, and suggests patches you can review and approve." Anthropic's architecture starts at the model. Everyone else started at the rules engine and tried to bolt context on top.

If the validation layer trims 91% noise to something engineers will actually clear, the buyer stops paying for detection. Detection became a commodity years ago. Triage carries the pricing power now. The valuations were already catching up to the constraint. Today accelerated it.
Claude@claudeai

Claude Security is now in public beta for Claude Enterprise customers. Claude scans your codebase for vulnerabilities, validates each finding to cut false positives, and suggests patches you can review and approve.

9 replies · 21 reposts · 176 likes · 51.5K views
Manish Sharma retweeted
darkzodchi@zodchiii·
Anthropic's CISO just told you that 90% of their code is written by Claude. Then he explained how they protect their own secrets while doing it. Why is your .env file the weakest link in your entire AI workflow? Watch it, then grab the full security config below👇
darkzodchi@zodchiii

x.com/i/article/2049…

44 replies · 145 reposts · 1.3K likes · 443.2K views
Manish Sharma retweeted
Andrew Ng@AndrewYNg·
How we prompt AI is very different in 2026 than in 2022 when ChatGPT came out. I'm teaching a new course, AI Prompting for Everyone, to help you become an AI power user — whatever your current skill level. It covers skills that apply across ChatGPT, Gemini, Claude, and other AI tools.

How to use deep research mode for well-researched reports on complex questions. How to give AI the right context, including more documents and images than most people realize you can provide. When to ask AI to think hard for several minutes on important decisions like what car to buy, what to study, or what job to take. And how to use AI to generate images, analyze data, and build simple games and websites.

I also cover intuitions about how these models work under the hood, so you know when to trust an answer and when not to. Along the way, you'll see flying squirrels, a creativity test, some of my old family photos, and fireworks. Join me at deeplearning.ai/courses/ai-pro…
154 replies · 768 reposts · 4.3K likes · 717.5K views
Manish Sharma retweeted
Andrej Karpathy@karpathy·
Fireside chat at Sequoia Ascent 2026 from a ~week ago. Some highlights:

The first theme I tried to push on is that LLMs are about a lot more than just speeding up what existed before (e.g. coding). Three examples of new horizons:
1. menugen: an app that can be fully engulfed by LLMs, with no classical code needed: input an image, output an image, and an LLM can natively do the thing.
2. install .md skills instead of install .sh scripts. Why create a complex Software 1.0 bash script for e.g. installing a piece of software if you can write the installation out in words and say "just show this to your LLM". The LLM is an advanced interpreter of English and can intelligently target installation to your setup, debug everything inline, etc.
3. LLM knowledge bases as an example of something that was *impossible* with classical code because it's computation over unstructured data (knowledge) from arbitrary sources and in arbitrary formats, including simply text articles etc.
I pushed on these because in every new paradigm change, the obvious things are always in the realm of speeding up or somehow improving what existed, but here we have examples of functionality that either suddenly perhaps shouldn't even exist (1, 2), or was fundamentally not possible before (3).

The second (ongoing) theme is trying to explain the pattern of jaggedness in LLMs. How it can be true that a single artifact will simultaneously 1) coherently refactor a 100,000-line code base *and* 2) tell you to walk to the car wash to wash your car. I previously wrote about the source of this as having to do with verifiability of a domain; here I expand on this as having to also do with economics, because revenue/TAM dictates what the frontier labs choose to package into training data distributions during RL. You're either in the data distribution (on the rails of the RL circuits) and flying, or you're off-roading in the jungle with a machete, in relative terms. Still not 100% satisfied with this, but it's an ongoing struggle to build an accurate model of LLM capabilities if you wish to practically take advantage of their power while avoiding their pitfalls, which brings me to...

Last theme is the agent-native economy. The decomposition of products and services into sensors, actuators and logic (split up across all of 1.0/2.0/3.0 computing paradigms), how we can make information maximally legible to LLMs, some words on the quickly emerging agentic engineering and its skill set, related hiring practices, etc., possibly even hints/dreams of fully neural computing handling the vast majority of computation with some help from (classical) CPU coprocessors.
Stephanie Zhan@stephzhan

@karpathy and I are back! At @sequoia AI Ascent 2026. And a lot has changed. Last year, he coined “vibe coding”. This year, he’s never felt more behind as a programmer. The big shift: vibe coding raised the floor. Agentic engineering raises the ceiling. We talk about what it means to build seriously in the agent era. Not just moving faster. Building new things, with new tools, while preserving the parts that still require human taste, judgment, and understanding.

252 replies · 718 reposts · 5.4K likes · 754.1K views
Manish Sharma retweeted
Anthropic@AnthropicAI·
BioMysteryBench, our new bioinformatics eval, tests whether Claude can devise creative solutions to open-ended research problems. Read more: anthropic.com/research/Evalu…
20 replies · 33 reposts · 335 likes · 61.2K views
Manish Sharma retweeted
Aakash Gupta@aakashgupta·
SquareMind raised $18M to put a robotic arm in dermatology clinics that scans your entire body in minutes.

The lead investor is the signal here. Sonder Capital, co-founded by Fred Moll. Same Fred Moll who founded Intuitive Surgical and built the Da Vinci robot, now a $160B business. Same Fred Moll who founded Auris Health and sold it to J&J for $3.4B. He keeps running the same playbook. The robot does not diagnose. The robot acquires. The doctor reviews and decides.

Why dermatology now? Because the bottleneck in skin cancer detection has been hiding in plain sight for nine years. In 2017, Andre Esteva's Stanford team published in Nature. A neural network trained on 129,450 skin images matched the diagnostic accuracy of 21 board-certified dermatologists on melanoma classification. The AI was solved. That was 2017.

Your dermatology appointment in 2026 is still a doctor visually scanning your body for ten minutes, comparing what they see to memory, with no standardized record from last year. Nine years of model improvements. Almost no clinical deployment.

The reason is upstream of the AI. Esteva's CNN classified well-curated, cropped, dermatologist-selected close-ups of confirmed lesions. It needed someone to point a dermoscope at the right mole, with the right lighting, at the right angle, before it could do anything. Handheld dermoscopes capture one mole at a time. Manual and inconsistent across visits. Useless for time-series comparison.

Now look at what 80% means. 80% of melanomas are new lesions. Brand new ones that did not exist at the last appointment. The only question that matters in skin screening is what is here that was not here before. Skin cancer hits 20% of Americans. Melanoma diagnoses are up 42% over the last decade. No human can answer the change-detection question from memory across hundreds of moles on thousands of patients per year.

Swan acquires every square centimeter at dermoscopic resolution. Same angle. Same lighting. Same distance. Every visit. The AI does not classify lesions in isolation. It compares your back today to your back six months ago, flags what changed, and hands the dermatologist a sorted worklist.

The classifier was ready in 2017. The pixels weren't. SquareMind built the camera that AI dermatology needed nine years ago. And the hardware is just the wedge. The longitudinal skin database it builds, patient by patient, visit by visit, is the actual company. Whoever owns the time-series of every mole on every patient becomes the infrastructure layer every dermatology AI tool runs on top of. The robot is the trojan horse. The data is the moat.
13 replies · 27 reposts · 264 likes · 71.6K views
Manish Sharma retweeted
Ronin@DeRonin_·
Andrej Karpathy: "90% of what AI twitter tells you to learn will be dead in 6 months"

Here are 10 things senior AI engineers stopped wasting time on:
1. AutoGen / AG2: moved to community maintenance, releases stalled. Dead for production.
2. CrewAI: demos well, breaks in production. Engineers building real systems already moved off it.
3. Autonomous agent pitches: the AutoGPT / BabyAGI wave is dead in product form. The industry settled on supervised, bounded, evaluated agents.
4. Agent app stores / marketplaces: promised since 2023, zero enterprise traction.
5. SWE-bench leaderboard chasing: researchers proved nearly every public benchmark can be gamed without solving the underlying task.
6. Microsoft Semantic Kernel: unless you're locked into the Microsoft enterprise stack, it's not where the ecosystem is heading.
7. DSPy: philosophical merit, niche audience. Not a general agent framework.
8. Horizontal "build any agent" platforms: Google Agentspace, AWS Bedrock Agents, Copilot Studio. Confusing, slow-shipping, the math still favors building yourself.
9. Per-seat SaaS pricing for agent products: the market moved to outcome-based. Per-seat is already dead.
10. The framework that went viral on HN this week: wait 6 months. If it still matters, it'll be obvious.

What actually compounds instead:
- context engineering
- tool design
- orchestrator-subagent pattern
- eval discipline
- the harness mindset (harness > model, always)
- MCP as the protocol layer

Be a few steps ahead of your competitors and outperform the market before this becomes mass opinion. Study this.
Rohit@rohit4verse

x.com/i/article/2048…

83 replies · 289 reposts · 2.5K likes · 401.2K views
Manish Sharma retweeted
Avid@Av1dlive·
Andrej Karpathy: 10x engineers are normal; real agentic engineers are 100x. This guy just shipped the playbook to become 100x: context engineering, tool design, orchestrator-subagent, evals, the harness mindset. Watch & bookmark it for this weekend.
Rohit@rohit4verse

x.com/i/article/2048…

53 replies · 340 reposts · 3K likes · 548.6K views
Manish Sharma retweeted
Rohit@rohit4verse·
Harrison Chase (LangChain CEO) just walked through four ways to give an agent memory. All four assume the model is still holding the right tokens. It isn't. At token 4,096 the cache ran a silent eviction nobody wrote. The user's name was in that batch. The first founder to write the eviction policy ships a 100B agent that remembers a person.
Siddharth@Pseudo_Sid26

x.com/i/article/2049…

44 replies · 61 reposts · 487 likes · 107.1K views
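The failure mode described above, and the fix being asked for, can be sketched as the difference between a silently-evicting FIFO window and an explicit policy that pins facts like the user's name. A toy illustration only, not LangChain's (or anyone's) actual memory implementation.

```python
# Hedged sketch: a context window with an explicit, written-down eviction
# policy. Pinned facts survive; everything else is FIFO-evicted silently.
from collections import deque

class Context:
    def __init__(self, max_items=4):
        self.pinned = []                       # survives eviction by policy
        self.window = deque(maxlen=max_items)  # oldest items silently dropped

    def add(self, msg, pin=False):
        (self.pinned if pin else self.window).append(msg)

    def tokens(self):
        # What the model actually "holds" on the next turn.
        return self.pinned + list(self.window)

ctx = Context(max_items=4)
ctx.add("user name: Priya", pin=True)  # the eviction policy, made explicit
for i in range(10):
    ctx.add(f"turn {i}")               # old turns get evicted along the way
assert "user name: Priya" in ctx.tokens()  # still remembered after eviction
```

Without the `pin` flag, the name lands in the deque and is gone by turn five, which is exactly the silent eviction the post is complaining about.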