Steve Wilson

18.3K posts

@virtualsteve

Leading the charge in AI security. Chief AI and Product Officer @ Exabeam, Author @ O'Reilly, Project Lead at OWASP #cybersecurity #ai #cloud

San Jose, CA · Joined May 2008
668 Following · 4.3K Followers
Pinned Tweet
Steve Wilson @virtualsteve
How will we use AI Agents in cyber defense? Check out this snippet from my interview in Davos.
English
4
1
9
691
Steve Wilson retweeted
Mo @atmoio
Andrej Karpathy admits he’s struggling with AI
Stephanie Zhan @stephzhan

@karpathy and I are back! At @sequoia AI Ascent 2026. And a lot has changed. Last year, he coined “vibe coding”. This year, he’s never felt more behind as a programmer. The big shift: vibe coding raised the floor. Agentic engineering raises the ceiling. We talk about what it means to build seriously in the agent era. Not just moving faster. Building new things, with new tools, while preserving the parts that still require human taste, judgment, and understanding.

English
113
192
3K
310.5K
Steve Wilson retweeted
Mario Nawfal @MarioNawfal
Electric surfboards are about to make jet skis look ancient. Silent, fast, and way too fun.
English
1.4K
3.8K
43.3K
6.6M
Steve Wilson retweeted
Ten18 by Exabeam @Ten18byExabeam
Gone are the days when sharing MTTR or a vulnerability patching rate will satisfy the board. CISOs must articulate how AI is optimizing security operations from a business perspective, and many struggle with that. Learn more: ow.ly/7OLh50YZHBx @virtualsteve
English
0
2
2
128
Steve Wilson retweeted
Jonathan Ross @JonathanRoss321
For 50 years, software engineering ran on code rationing. Writing code was expensive, so we rationed it carefully through roadmaps, RFCs, prioritization meetings, and scope reviews. This created a role: the No Engineer. No, that won't scale. No, we don't have bandwidth. No, that's out of scope. No, we need a design doc first. The No Engineer was valuable for 50 years. Every "no" saved real money. Their judgment was the rationing system. LLMs will be the end of code rationing. Code is cheap now. And while the No Engineer is explaining why something can't be done, the Yes Engineer has already shipped three versions of it. If you're a Yes Engineer, the next decade is yours.
English
392
215
2.1K
733.8K
Steve Wilson retweeted
Lenny Rachitsky @lennysan
My biggest takeaways from Claude Code's Head of Product @_catwu:
1. Anthropic’s product development timelines have gone from six months to one month, sometimes one week, sometimes one day. Part of this acceleration is access to the latest models (i.e. Mythos). Another is shipping new products into “research preview,” making clear it's early, experimental, and might not be supported forever. Another is an evergreen "launch room" where engineers post ready features and marketing turns around announcements the next day.
2. The PM role is shifting from coordinating multi-month roadmaps to enabling teams to ship daily. As Cat puts it, “There should be less emphasis on making sure you are aligning your multi-quarter roadmaps with your partner teams and more emphasis on, OK, how can we figure out the fastest way to get something out the door?”
3. The most efficient shipping unit is an engineer with great product taste. On Cat’s team, many engineers go end-to-end—from seeing user feedback on Twitter to shipping a product by the end of the week—without a PM involved. Also, almost all the PMs on the Claude Code team have either been engineers or ship code themselves, and the designers have been front-end engineers. The roles are merging, and the most valuable skill is product taste, not job title.
4. Build products that are on the edge of working. Claude Code’s code review product failed multiple times because earlier models weren’t accurate enough. But because the prototype was already built, they could swap in Opus 4.5 and 4.6 and immediately test whether the gap was closed. Teams that wait for the model to be ready will always be a cycle behind.
5. The most underrated skill for building AI products is asking the model to introspect on its own mistakes. Cat regularly asks the model why it made an unexpected decision. The model will explain that something in the system prompt was confusing, or that it delegated verification to a subagent that didn’t check its work. This reveals what misled the model so the team can fix the harness.
6. Every model release forces their team to revisit existing products and audit their system prompt to remove features the model no longer needs. Claude Code’s to-do list was a crutch for earlier models that couldn’t track their own work. With Opus 4, the model handles it natively. Features built as scaffolding for weaker models become debt when the model catches up—so the team actively strips them.
7. Anthropic employees build custom internal tools instead of buying SaaS products. A sales team member built a web app that pulls from Salesforce, Gong, and call notes to auto-customize pitch decks—work that used to take 20 to 30 minutes now takes seconds. Their core stack is Claude Code, Cowork, and Slack. No Notion, no Linear, no Figma.
8. People underestimate how much Claude’s personality contributes to its success. As Cat describes it, “When you reflect on everyone you’ve worked with, there’s just some people where you’re like, I really like their energy, their vibe.” Claude is designed to be low-ego, positive, competent, and earnest—qualities that make it feel like a great coworker, not just a tool. This isn’t cosmetic; it’s what makes people want to use Claude for hours every day. The team has a dedicated person, Amanda, who “molds Claude’s character,” and it’s one of the hardest roles at the company because success is so subjective.
9. The future of work is managing fleets of AI agents, not doing the work yourself. Cat sees a clear progression: first, individual tasks become successful. Then people start running multiple tasks at the same time (multi-Clauding). Next, people will run 50 or 100 tasks simultaneously, which will require new infrastructure—remote execution, better interfaces for managing tasks, agents that fully verify their work, and self-improving systems that incorporate feedback. The human role shifts from doing the work to knowing which tasks to look into, verifying outputs, and giving feedback that makes the system better over time.
10. Hire people who lean into chaos and face every challenge with a smile. At Anthropic, there are weeks when a P0 on Sunday becomes a P00 by Monday and a P000 by Monday afternoon. If you get too stressed about any one thing, you’ll burn out. Their team looks for people who can look at a hard challenge and say, “Wow, that’s gonna be hard. But I’m excited to tackle it and I’m gonna do the best that I possibly can.” This mindset—optimism, resilience, and comfort with constant change—is increasingly essential as the pace of AI development accelerates.
Don't miss the full conversation: youtube.com/watch?v=Pplmzl…
Lenny Rachitsky @lennysan

How Anthropic’s product team moves faster than anyone else
I sat down with @_catwu, Head of Product for Claude Code at @AnthropicAI, to get a peek into their unprecedented shipping pace, how AI is changing the PM role, and how to be the right amount of AGI-pilled. We discuss:
🔸 How Anthropic’s shipping cadence went from months to weeks to days
🔸 The emerging skills PMs need to develop right now
🔸 Why you should build products that don't work yet—then wait for the model to catch up
🔸 Why a 95% automation isn't really an automation
🔸 Cat’s most underrated AI skill (introspection)
🔸 What Cat actually looks for when hiring PMs now (hint: it's not traditional PM skills)
Listen now 👇 youtu.be/PplmzlgE0kg

English
99
297
2.9K
840.4K
Steve Wilson retweeted
Exabeam @exabeam
What a moment for Exabeam. Accepting the 2026 Google Cloud Partner of the Year Award for Security: Analytics & Operations on stage is powerful recognition of the leadership, innovation, and momentum driving our business forward. Read the press release: ow.ly/EJyl50YNN1c
English
1
1
6
166
Steve Wilson retweeted
TED Talks @TEDTalks
“The lobster is loose, and it’s not going back into the tank,” says @openclaw founder @steipete. In this brand new talk from #TED2026 he shares why AI agents — built by you — are the future: t.ted.com/DPASxmF
English
71
201
1.2K
246.9K
Steve Wilson retweeted
Viktor Oddy @viktoroddy
Claude Design is insane. ❤️‍🔥 Just recorded an 18-min tutorial on how to build animated, award-winning websites with Claude Design + Opus 4.7!
English
334
2.1K
25.6K
3.2M
Steve Wilson retweeted
Ten18 by Exabeam @Ten18byExabeam
This is not a hot take: #AI can't fully replace human expertise in security operations. As @virtualsteve says, that idea is just hype. Find out how CISOs are using it to complement #security teams instead: ow.ly/V1iS50YJXwC
English
0
2
3
177
Steve Wilson @virtualsteve
@meinardi @HowToAI_ I don't think we're seeing any slowdown in base capabilities. I think it's mostly a lack of imagination by naive users who think things have slowed down. I wrote about this last year, but my opinion hasn't changed much - linkedin.com/pulse/ai-isnt-…
English
0
0
0
15
Steve Wilson retweeted
How To AI @HowToAI_
RAG is broken and nobody's talking about it. Stanford researchers exposed the fatal flaw killing every "AI that reads your docs" product in existence. It’s called "Semantic Collapse," and it happens the second your knowledge base hits critical mass. If you've noticed your AI getting "dumber" as you add more data, this is exactly why.
Right now, companies are dumping thousands of documents into their AI, thinking it’s getting smarter. When you add a document to RAG, it converts it into a high-dimensional vector. Under 10,000 documents, this works perfectly. Similar concepts cluster together. But past 10,000 documents, the space fills up. The clusters overlap. The distances compress. Everything starts to look "relevant."
It is a mathematical law called the Curse of Dimensionality. In a 1000-dimensional space, 99.9% of your data lives on the outer edge. All points become equidistant from each other. That perfect, relevant document you are looking for now has the exact same mathematical similarity as 50 completely irrelevant ones.
The Stanford findings are brutal: At 50,000 documents, precision drops by 87%. Semantic search actually becomes worse than old-school keyword search. Adding more context doesn’t fix the AI. It makes the hallucinations worse. Your "nearest neighbor" search isn't finding the best answer anymore. It's finding everyone.
We thought RAG solved hallucinations. It didn't. It just hid them behind math.
English
202
594
2.8K
317.4K
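The distance-compression effect that the How To AI post attributes to the Curse of Dimensionality can be checked numerically on synthetic data. The sketch below is only an illustration under stated assumptions: it uses uniformly random points, not real document embeddings, and the function name `relative_contrast` is made up for this example. It says nothing about the specific figures quoted in the post (10,000 documents, 87%); it only shows that the gap between the nearest and farthest neighbor shrinks relative to the nearest distance as the dimension grows.

```python
import math
import random


def relative_contrast(dim, n_points=500, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query to n_points uniformly random points in [0, 1]^dim.

    A small value means the nearest and farthest points are almost
    equally far away, so "nearest neighbor" carries little signal.
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # Hypothetical dimensions chosen for illustration only.
    for dim in (2, 10, 100, 1000):
        print(f"dim={dim:5d}  relative contrast={relative_contrast(dim):.3f}")
```

Real embedding spaces are far from uniform, so the effect on an actual corpus may be weaker or stronger than this toy run suggests.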
Steve Wilson @virtualsteve
No grudges here. I put promise in quotes because no one actually "promised" anything. I think the originally quoted research is important for people to see that Vector RAG isn't a free lunch to agent eidetic memory. I do think there's lots going on in the field of agent memory now, that's quite promising!
English
1
0
1
14
Marco Meinardi @meinardi
Yes, sure, we’ve been sharding databases for ages to address other types of scaling issues. But we’ve been pretty good at implementing world-scale distributed systems that are “eventually stateful.” Why can’t the same happen with vector databases for RAG? Broken promises in this field are the order of the day; I wouldn’t hold a grudge over that.
English
1
0
0
16
Steve Wilson @virtualsteve
It's an old idea that works for problems that are "like" this, but I'm sure it gets very application-specific in terms of how you do it. The "nice" thing about the vector DB solution was it "promised" you could just put stuff there and the LLM would "remember" it - en.wikipedia.org/wiki/Shard_(da…
English
1
0
0
30
Marco Meinardi @meinardi
@HowToAI_ @virtualsteve What about separating documents into multiple vector databases, each hosting an affinity group of max 10k documents, and adding a router in front?
English
1
0
1
47
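Marco Meinardi's suggestion above (several small vector stores, each holding one affinity group capped at 10k documents, with a router in front) could look roughly like the following. This is a minimal sketch, not a description of any real vector database: the `Shard` and `Router` classes, the centroid-based routing rule, and the in-memory brute-force search are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Shard:
    """One affinity group of documents, capped at max_docs (hypothetical)."""
    name: str
    max_docs: int = 10_000
    docs: list = field(default_factory=list)  # list of (doc_id, vector)

    def add(self, doc_id, vector):
        if len(self.docs) >= self.max_docs:
            raise ValueError(f"shard {self.name!r} is full")
        self.docs.append((doc_id, vector))

    def centroid(self):
        """Mean vector of the shard; assumes the shard is non-empty."""
        dim = len(self.docs[0][1])
        n = len(self.docs)
        return [sum(vec[i] for _, vec in self.docs) / n for i in range(dim)]

    def search(self, query, k=3):
        """Brute-force top-k by cosine similarity within this shard only."""
        ranked = sorted(self.docs, key=lambda d: cosine(query, d[1]), reverse=True)
        return [doc_id for doc_id, _ in ranked[:k]]


class Router:
    """Sends each query to the shard whose centroid is most similar to it."""

    def __init__(self, shards):
        self.shards = shards

    def route(self, query):
        candidates = [s for s in self.shards if s.docs]  # skip empty shards
        return max(candidates, key=lambda s: cosine(query, s.centroid()))

    def search(self, query, k=3):
        return self.route(query).search(query, k)
```

As Steve Wilson's reply notes, how the groups are formed is likely to be application-specific; routing by centroid is just one simple choice, and a query that straddles two groups would need to be sent to more than one shard.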
Steve Wilson retweeted
Felix Rieseberg @felixrieseberg
Today is a big day! We're launching a ~ new ~ version of Claude Code in the desktop app. It's been redesigned from the ground up for parallel work and is a lot faster. It's been my main way to use Claude Code for the last few weeks.
English
616
461
9.9K
948.2K
Steve Wilson retweeted
Big Brain AI @realBigBrainAI
Peter Steinberger, creator of OpenClaw, on why AI agents still produce "slop" without human taste in the loop: "You can create code and run all night and then you have like the ultimate slop because what those agents don't really do yet is have taste." Peter is direct: raw capability without direction still produces mediocre output. "They are spiky smart and they're really good at things, but if you don't navigate them well, if you don't have a vision of what you're going to build, it's still going to be slop. If you don't ask the right questions, it's still going to be slop." Great AI-assisted work is defined by the human guiding it. @steipete describes his own creative process when starting a new project: "When I start a project, I have like this very rough idea what it could be. And as I play with it and feel it, my vision gets more clear. I try out things, some things don't work, and I evolve my idea into what it will become." Most people skip this part entirely, front-loading everything into a single prompt and wondering why the result feels hollow. "My next prompt depends on what I see and feel and think about the current state of the project." Each step informs the next. The work itself is the feedback loop. "But if you try to put everything into a spec up front, you miss this kind of human-machine loop. And then I don't know how something good can come out without having feelings in the loop — almost like taste." The agentic trap is what happens when you remove yourself from the process too early.
English
175
282
2.1K
486.2K
Steve Wilson retweeted
Steve Yegge @Steve_Yegge
I was chatting with my buddy at Google, who's been a tech director there for about 20 years, about their AI adoption. Craziest convo I've had all year. The TL;DR is that Google engineering appears to have the same AI adoption footprint as John Deere, the tractor company. Most of the industry has the same internal adoption curve: 20% agentic power users, 20% outright refusers, 60% still using Cursor or equivalent chat tool. It turns out Google has this curve too. But why is Google so... average? How is it that a handful of companies are taking off like a spaceship, and the rest, including Google, are mired in inaction? My buddy's observation was key here: There has been an industry-wide hiring freeze for 18+ months, during which time nobody has been moving jobs. So there are no clued-in people coming in from the outside to tell Google how far behind they are, how utterly mediocre they have become as an eng org. He says the problem is that they can't use Claude Code because it's the enemy, and Gemini has never been good enough to capture people's workflows like Claude has, so basically agentic coding just never really took off inside Google. They're all just plodding along, completely oblivious to what's happening out there right now. Not only is Google not able to do anything about it, they don't seem to be aware of the problem at all. I'm having major flashbacks to fifty years ago as a kid at the La Brea Tar Pits, asking, "why can't they just climb out?" My Google friend and I had this conversation over a month ago. I didn't share it because I wanted to look around a bit, and see if it's really as bad as all that. I've been talking to people from dozens of companies since then. And yeah. It's as bad as all that. Google is about average. Some companies at the bottom have near-zero AI adoption and can't even get budget for AI. They may have moats and high walls, but the horde is coming for them all the same. 
And then there are a few companies I've met recently who are *amazingly* leaned in to AI adoption. One category-leader company just cancelled IntelliJ for a thousand engineers. That's an incredibly bold move, one of many they're making towards agentic adoption. In my opinion, that company is setting themselves up for a _huge_ W. As for the rest, well, it's the Great Siloing. Everyone's flying blind. With nobody moving companies, no company knows where they stand on the AI adoption curve. Nobody knows how they're doing compared to everyone else. Half of them just check a box: "We enabled {Copilot/Cursor} for everyone!" Cue smug celebrations. They think this is like getting SOC2 compliance, just a thing they turn on and now it's "solved." And they don't realize that they've done effectively nothing at all. All because of a hiring freeze.
English
535
471
5.4K
2.8M
Ahsen @meetahsen
@virtualsteve agents aren't dead, we just tried to make them do everything at once. learned the hard way that you have to scope their capabilities tightly or they just burn cash and hallucinate. what was the first thing your "digital worker" failed at spectacularly?
English
1
0
1
3