

Rishi Kulkarni
@rishikulkarni
Co-founder https://t.co/Zagg7qHRNk, Co-founder @revv_so (acquired LegalZoom). Founder @1clickio (acquired Freshworks)












Welcome Salesforce Headless 360: No Browser Required! Our API is the UI. The entire Salesforce, Agentforce, and Slack platforms are now exposed as APIs, MCP, and a CLI. AI agents can access data, workflows, and tasks directly in Slack, by voice, or anywhere else with Salesforce Headless 360. Faster builds, agentic everything. 🚀 #Salesforce #Agentforce #AI venturebeat.com/ai/salesforce-…


🚨 BREAKING: Stanford's 423-page AI Index Report 2026 is out! [Bookmark it below]. These are its key takeaways:

1. AI capability is not plateauing. It is accelerating and reaching more people than ever.
2. The U.S.-China AI model performance gap has effectively closed.
3. The U.S. hosts the most AI data centers, with the majority of its chips fabricated by one Taiwanese foundry.
4. AI models can win a gold medal at the International Mathematical Olympiad but cannot reliably tell time, an example of what researchers call the jagged frontier of AI.
5. Robots still fail at most household tasks, even as they excel in controlled environments.
6. Responsible AI is not keeping pace with AI capability, with safety benchmarks lagging and incidents rising sharply.
7. The U.S. leads in AI investment, but its ability to attract global talent is declining.
8. AI adoption is spreading at historic speed, and consumers are deriving substantial value from tools they often access for free.
9. Productivity gains from AI are appearing in many of the same fields where entry-level employment is starting to decline.
10. AI's environmental footprint is expanding alongside its capabilities.
11. AI models for science can outperform human scientists, though bigger models do not always perform better.
12. AI is transforming clinical care, but rigorous evidence remains limited.
13. Formal education is lagging behind AI, but people are learning AI skills at every stage of life.
14. AI sovereignty is becoming a defining feature of national policy, but capabilities remain uneven, even as open-source development helps to redistribute who participates.
15. AI experts and the public have very different perspectives on the technology's future, and global trust in institutions to manage AI is fragmented.

👉 Download the full document below.
👉 To learn more about AI's legal and ethical challenges, join my newsletter's 93,500+ subscribers (link below).

from weights → context → harness engineering (evolution of the agent landscape, 2022-26)

the biggest shift in AI agents had nothing to do with making models smarter. it was about making the environment around them smarter. here's how agent engineering evolved in just 4 years, across three distinct phases:

𝗽𝗵𝗮𝘀𝗲 𝟭: 𝘄𝗲𝗶𝗴𝗵𝘁𝘀 (𝟮𝟬𝟮𝟮)

everything was about the model itself. bigger models, more data, better training. scaling laws told us that progress = more parameters. RLHF and fine-tuning shaped behavior. if you wanted a better agent, you trained a better model.

this worked great for single-turn tasks. ask a question, get an answer. but it hit a wall fast. updating one fact meant retraining. auditing behavior was nearly impossible. and personalization across millions of users from one frozen set of weights? not happening.

𝗽𝗵𝗮𝘀𝗲 𝟮: 𝗰𝗼𝗻𝘁𝗲𝘅𝘁 (𝟮𝟬𝟮𝟯-𝟮𝟬𝟮𝟰)

the realization: you don't always need to change the model. you can change what the model sees. prompt engineering, few-shot examples, chain-of-thought, RAG. suddenly the same frozen model could behave completely differently based on what you put in front of it. developers stopped fine-tuning and started iterating on prompts and retrieval pipelines instead. it was cheaper, faster, and surprisingly effective.

but context windows are finite. long prompts get noisy. models attend unevenly (the "lost in the middle" problem is real). and every new session starts fresh with zero memory of what happened before. context made agents flexible. it didn't make them reliable.

𝗽𝗵𝗮𝘀𝗲 𝟯: 𝗵𝗮𝗿𝗻𝗲𝘀𝘀 𝗲𝗻𝗴𝗶𝗻𝗲𝗲𝗿𝗶𝗻𝗴 (𝟮𝟬𝟮𝟱-𝟮𝟬𝟮𝟲)

this is where we are now, and the shift is fundamental. the question changed from "what should we tell the model?" to "what environment should the model operate in?" the model is no longer the sole location of intelligence. it sits inside a harness that includes persistent memory, reusable skills, standardized protocols (like MCP and A2A), execution sandboxes, approval gates, and observability layers.

the model stays the same. what changes is the task it's being asked to solve.

a concrete example: a coding agent asked to implement a feature, run tests, and open a PR. without a harness, the model must keep repo structure, project conventions, workflow state, and tool interactions all inside a fragile prompt. with a harness, persistent memory supplies context, skill files encode conventions, protocolized interfaces enforce correct schemas, and the runtime sequences steps and handles failures. same model. completely different reliability.

𝘁𝗵𝗲 𝗽𝗮𝘁𝘁𝗲𝗿𝗻 𝗮𝗰𝗿𝗼𝘀𝘀 𝗮𝗹𝗹 𝘁𝗵𝗿𝗲𝗲 𝗽𝗵𝗮𝘀𝗲𝘀 𝗶𝘀 𝘀𝗶𝗺𝗽𝗹𝗲:

- weights encoded knowledge in parameters (fast but rigid)
- context staged knowledge in prompts (flexible but ephemeral)
- harnesses externalized knowledge into persistent infrastructure (reliable and governable)

each phase didn't replace the previous one. it layered on top. weights still matter. context engineering still matters. but the center of gravity has moved outward. the most consequential improvements in agent reliability today rarely come from changing the base model. they come from better memory retrieval, sharper skill loading, tighter execution governance, and smarter context budget management. building better agents increasingly means building better environments for models to operate in.

there's a great paper on this: Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering
paper: arxiv.org/abs/2604.08224

i also published this deep dive (article) on agent harness engineering, covering the orchestration loop, tools, memory, context management, and everything else that transforms a stateless LLM into a capable agent.
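the phase-3 idea can be sketched in a few lines: a frozen, stubbed "model" sits inside a loop that supplies persistent memory, gates tool calls through a registry, budgets context, and handles failures. all names below are illustrative, not any specific framework's API.

```python
# Minimal harness-loop sketch: reliability comes from the scaffolding around
# the model (memory, tool gate, step budget), not from the model itself.
# Everything here is a hypothetical illustration.

class Harness:
    def __init__(self, model, tools):
        self.model = model    # callable: context dict -> action dict
        self.tools = tools    # name -> function registry (the "protocolized" interface)
        self.memory = []      # persistent memory, survives across runs

    def run(self, task, max_steps=5):
        for _ in range(max_steps):
            # context budget management: stage only the most recent memory
            context = {"task": task, "memory": self.memory[-10:]}
            action = self.model(context)
            if action["type"] == "finish":
                return action["result"]
            tool = self.tools.get(action["tool"])
            if tool is None:  # gate: unknown tools are logged, never executed
                self.memory.append({"error": f"unknown tool {action['tool']}"})
                continue
            self.memory.append({"tool": action["tool"],
                                "obs": tool(**action.get("args", {}))})
        return None           # the runtime, not the model, handles a blown step budget

# stub "model": runs the tests once, then finishes; a real harness calls an LLM here
def stub_model(context):
    if not context["memory"]:
        return {"type": "tool", "tool": "run_tests", "args": {}}
    return {"type": "finish", "result": "tests passed, opening PR"}

harness = Harness(stub_model, {"run_tests": lambda: "3 passed"})
print(harness.run("implement feature X"))  # → tests passed, opening PR
```

note that the model function is pure and stateless; swapping it for a stronger one changes nothing about the loop, which is exactly the point of the harness framing.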

JUST IN: Use of AI in the office is reportedly creating a flood of "workslop" that takes longer to fix than doing the work from scratch.

Every time I see a tweet saying "I can vibe code this in a weekend," I think of the Slack notification system. It takes time, persistence, and effort to get the details right. Sure, a lot of simple workflows will get vibe coded away. And maybe you can put this in Claude Code and get the code right in one shot. But quality, depth, and great systems will still have value and take time. You can't vibe code lessons. Now and forever.

How do you give an AI agent a GitHub token without the agent actually seeing the token? 🔐 We’re launching outbound Workers for Sandboxes. Programmatically inject credentials, log egress, and enforce zero-trust policies at the network level—all transparently. #AgentsWeek cfl.re/4tfSt1G
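the general pattern behind this (independent of Cloudflare's actual product) is an egress gateway: the agent hands over a request with no secret in it, and the gateway attaches the token, logs the call, and enforces an allowlist. a minimal sketch, with hypothetical names:

```python
# Illustrative credential-injection sketch: the secret lives only in the
# gateway, so it never enters the agent's context window. Names and structure
# are assumptions for demonstration, not any vendor's API.
import urllib.parse
import urllib.request

SECRETS = {"api.github.com": "ghp_example_token"}  # held by the gateway, never the agent
ALLOWED_HOSTS = set(SECRETS)                       # zero-trust: deny by default

def egress(url, method="GET"):
    host = urllib.parse.urlparse(url).hostname
    if host not in ALLOWED_HOSTS:
        raise PermissionError(f"egress to {host} denied")
    req = urllib.request.Request(url, method=method)
    req.add_header("Authorization", f"Bearer {SECRETS[host]}")
    print(f"egress log: {method} {url}")           # audit trail for every outbound call
    return req  # a real gateway would send this; returned here for inspection

req = egress("https://api.github.com/user")
print(req.get_header("Authorization"))  # → Bearer ghp_example_token
```

because the token is added at the network layer, a prompt-injected agent can leak at most the ability to call allowlisted hosts, never the credential itself.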

The more enterprises I talk to about AI agent transformation, the clearer it is that there is going to be a new type of role in most enterprises going forward. The job is to be the agent deployer and manager on teams. Here's the rough JD:

This person will need to figure out what the highest-leverage set of workflows on a team is (either existing or new ones) where agents can actually drive significantly more value for the team and company. In general, it's going to be in areas where, if you threw compute (in the form of agents) at a task, you could either execute it 100X faster or do it 100X more times than before. Examples would be processing orders of magnitude more leads to hand them off to reps with extra customer signal, automating a contracting review and intake process, streamlining a client onboarding process to remove as many steps as possible, setting up knowledge bases that the whole company taps into, and so on.

This person's job is to figure out what the future-state workflow needs to look like to drive this new form of automation, and how to connect up the various existing or new systems in such a way that this can be fulfilled. The gnarly part of the work is mapping structured and unstructured data flows, figuring out the ideal workflow, getting the agent the context it needs to do the work properly, figuring out where the human interfaces with the agent and at what steps, managing evals and reviews after any major model or data change, and running the agents on an ongoing basis while tracking KPIs, and so on.

The person must be good at mapping the process and understanding where the value could be unlocked, be relatively technical, and have full autonomy to connect up business systems and drive automation. This means they're comfortable with skills, MCP, CLIs, and so on, and the company believes it's safe for them to do so. But they must also be great operationally and at business.
It may be an existing person repositioned, or a totally net-new hire. There will likely need to be one or more of these people on every team, so it's not a centralized role per se. It may roll up into IT or an AI team, or live in the function and just have checkpoints with a central function. This would also be a fantastic job for next-gen hires who are leaning into AI and are technical. And for anyone concerned about the future of engineers, this will be an obvious area for those skills as well.

Jesse Genet on Agentic Parenting

Jesse Genet joins a16z's Sarah Wang and Katherine Boyle to discuss her journey from founder to parent, how she's using agents in her household, and how AI could transform parenting for the better.

00:00 YC founder turned homeschool mom
03:00 Discovering Claude Code and agentic building
06:00 Building while homeschooling 4 kids under 5
11:00 How AI generates personalized lesson plans and logs progress
18:00 Jesse's 11 agents
27:05 Agent tech stack deep dive
33:56 How agents improve daily life
40:04 Letting kids interact with AI: values, risks, and the future of parenting

@jessegenet @KTmBoyle @sarahdingwang


Early versions of Mythos Preview often exhibited overeager and/or destructive actions—the model bulldozing through obstacles to complete a task in a way the user wouldn't want. We looked at what was going on inside the model during particularly concerning examples. (3/14)



