Matteo Cassese

11.3K posts

@matteoc

(he, him) Hi, my name is Matteo Cassese, and I’m a digital marketer, coach, and entrepreneur. I'm a digital pioneer. A storyteller. And an innovator at heart.

Worldwide · Joined September 2007
1.1K Following · 1.3K Followers
Pinned Tweet
Matteo Cassese @matteoc
My 02023 in Review: Resilience, Growth, and Transformation lafabbricadellarealta.com/02023-review/ This year was a pivotal chapter in my life. I faced the challenge of overcoming burnout. The highlight of my year was embracing a mastermind group.
Matteo Cassese retweeted
Andrej Karpathy @karpathy
Fireside chat at Sequoia Ascent 2026 from about a week ago. Some highlights:

The first theme I tried to push on is that LLMs are about a lot more than just speeding up what existed before (e.g. coding). Three examples of new horizons:

1. menugen: an app that can be fully engulfed by LLMs, with no classical code needed: input an image, output an image, and an LLM can natively do the thing.
2. Install .md skills instead of install .sh scripts. Why create a complex Software 1.0 bash script for e.g. installing a piece of software if you can write the installation out in words and say "just show this to your LLM"? The LLM is an advanced interpreter of English and can intelligently target the installation to your setup, debug everything inline, etc.
3. LLM knowledge bases, as an example of something that was *impossible* with classical code, because it's computation over unstructured data (knowledge) from arbitrary sources and in arbitrary formats, including simply text articles, etc.

I pushed on these because in every new paradigm change the obvious things are always in the realm of speeding up or somehow improving what existed, but here we have examples of functionality that either suddenly perhaps shouldn't even exist (1, 2) or was fundamentally not possible before (3).

The second (ongoing) theme is trying to explain the pattern of jaggedness in LLMs: how it can be true that a single artifact will simultaneously 1) coherently refactor a 100,000-line code base *and* 2) tell you to walk to the car wash to wash your car. I previously wrote about the source of this as having to do with the verifiability of a domain; here I expand on it as also having to do with economics, because revenue/TAM dictates what the frontier labs choose to package into training data distributions during RL. You're either in the data distribution (on the rails of the RL circuits) and flying, or you're off-roading in the jungle with a machete, in relative terms.

Still not 100% satisfied with this, but it's an ongoing struggle to build an accurate model of LLM capabilities if you wish to practically take advantage of their power while avoiding their pitfalls, which brings me to...

The last theme is the agent-native economy: the decomposition of products and services into sensors, actuators, and logic (split up across all of the 1.0/2.0/3.0 computing paradigms), how we can make information maximally legible to LLMs, some words on the quickly emerging agentic engineering and its skill set, related hiring practices, etc., possibly even hints/dreams of fully neural computing handling the vast majority of computation with some help from (classical) CPU coprocessors.
Stephanie Zhan @stephzhan (quoted)

@karpathy and I are back! At @sequoia AI Ascent 2026. And a lot has changed. Last year, he coined “vibe coding”. This year, he’s never felt more behind as a programmer. The big shift: vibe coding raised the floor. Agentic engineering raises the ceiling. We talk about what it means to build seriously in the agent era. Not just moving faster. Building new things, with new tools, while preserving the parts that still require human taste, judgment, and understanding.
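Karpathy's ".md skills instead of .sh scripts" idea can be pictured as an install file written in prose for an agent to interpret. The following is an invented illustration (the tool name and steps are hypothetical), not a real project's file:

```markdown
# INSTALL.md — show this to your coding agent instead of running a script

1. Download the latest release archive for your operating system from the
   project's releases page.
2. Unpack it and put the `mytool` binary somewhere on your PATH.
3. Run `mytool --version` to confirm it works. If anything fails, read the
   error, fix PATH or permissions as needed, and retry.
```

The point is that the agent, not a brittle bash script, handles the per-machine differences.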

Matteo Cassese retweeted
Andrea Volpini @cyberandy
Many installed OpenClaw. I didn’t. I forked @NanoClaw_AI instead. I didn’t need another assistant. I needed a team member: one that could run tasks using Agent WordLift. The interesting part? The team dynamics when one member suddenly has “claws.” 👉 wor.ai/udqRKK
Matteo Cassese @matteoc
Reverting from Opus 4.7 to Opus 4.6 in Claude Code and couldn't be happier. Less usage, better performance. Use this command to get the 1M-token version: /model claude-opus-4-6[1m]. @AnthropicAI this is feedback for you too.
Matteo Cassese @matteoc
@techNmak The principles are sound. I don't think they can be implemented this way. Claude needs enforceable process instructions; it fails on intentions and "rules." I tried…
Tech with Mak @techNmak
Andrej Karpathy wrote something that every Claude Code user has felt but couldn't articulate. Three quotes. Read them slowly.

"The models make wrong assumptions on your behalf and just run along with them without checking. They don't manage their confusion, don't seek clarifications, don't surface inconsistencies, don't present tradeoffs, don't push back when they should."

"They really like to overcomplicate code and APIs, bloat abstractions, don't clean up dead code... implement a bloated construction over 1000 lines when 100 would do."

"They still sometimes change/remove comments and code they don't sufficiently understand as side effects, even if orthogonal to the task."

You've seen all three. Probably this week. Someone turned these three observations into a single CLAUDE[.]md file. Four principles, one install, directly addressing each quote:

1. Think before coding. Don't assume. Don't hide confusion. State ambiguity explicitly. Present multiple interpretations rather than silently picking one. Push back if a simpler approach exists. Stop and ask rather than guess.

2. Simplicity first. No features beyond what was asked. No abstractions for single-use code. No "flexibility" that wasn't requested. No error handling for impossible scenarios. The test: would a senior engineer say this is overcomplicated? If yes, rewrite it.

3. Surgical changes. Don't "improve" adjacent code. Don't refactor things that aren't broken. Match the existing style even if you'd do it differently. If you notice unrelated dead code, mention it, don't delete it. Every changed line should trace directly to the request.

4. Goal-driven execution. Transform "fix the bug" into "write a test that reproduces it, then make it pass." Transform "add validation" into "write tests for invalid inputs, then make them pass." Give it success criteria and watch it loop until done.

This last one is Karpathy's key insight captured directly: "LLMs are exceptionally good at looping until they meet specific goals... Don't tell it what to do, give it success criteria and watch it go." It's a single file. Drop it into any project.
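The tweet doesn't reproduce the file itself. A minimal sketch of what such a CLAUDE.md could look like, written from the four principles described above (the actual published file may differ), might be:

```markdown
# CLAUDE.md — working rules for this repository

## 1. Think before coding
- State any ambiguity in the request explicitly before writing code.
- If several interpretations exist, list them and ask; don't silently pick one.
- Push back when a simpler approach would satisfy the request.

## 2. Simplicity first
- Build only what was asked: no extra features, abstractions, or "flexibility".
- No error handling for scenarios that cannot occur.
- Test: would a senior engineer call this overcomplicated? If yes, rewrite.

## 3. Surgical changes
- Touch only lines that trace directly to the request.
- Match existing style; don't refactor or "improve" adjacent code.
- Mention unrelated dead code; never delete it as a side effect.

## 4. Goal-driven execution
- Turn "fix the bug" into "write a failing test, then make it pass."
- Loop against explicit success criteria until they are met.
```

Claude Code reads a CLAUDE.md at the project root as standing instructions, so a file like this applies to every session in that repository.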
Matteo Cassese retweeted
Steve Yegge @Steve_Yegge
I was chatting with my buddy at Google, who's been a tech director there for about 20 years, about their AI adoption. Craziest convo I've had all year.

The TL;DR is that Google engineering appears to have the same AI adoption footprint as John Deere, the tractor company. Most of the industry has the same internal adoption curve: 20% agentic power users, 20% outright refusers, 60% still using Cursor or an equivalent chat tool. It turns out Google has this curve too.

But why is Google so... average? How is it that a handful of companies are taking off like a spaceship, and the rest, including Google, are mired in inaction?

My buddy's observation was key here: there has been an industry-wide hiring freeze for 18+ months, during which time nobody has been moving jobs. So there are no clued-in people coming in from the outside to tell Google how far behind they are, how utterly mediocre they have become as an eng org. He says the problem is that they can't use Claude Code because it's the enemy, and Gemini has never been good enough to capture people's workflows like Claude has, so basically agentic coding just never really took off inside Google. They're all just plodding along, completely oblivious to what's happening out there right now. Not only is Google not able to do anything about it, they don't seem to be aware of the problem at all. I'm having major flashbacks to fifty years ago as a kid at the La Brea Tar Pits, asking, "why can't they just climb out?"

My Google friend and I had this conversation over a month ago. I didn't share it because I wanted to look around a bit and see if it's really as bad as all that. I've been talking to people from dozens of companies since then. And yeah. It's as bad as all that. Google is about average. Some companies at the bottom have near-zero AI adoption and can't even get budget for AI. They may have moats and high walls, but the horde is coming for them all the same.

And then there are a few companies I've met recently who are *amazingly* leaned in to AI adoption. One category-leader company just cancelled IntelliJ for a thousand engineers. That's an incredibly bold move, one of many they're making toward agentic adoption. In my opinion, that company is setting itself up for a _huge_ W.

As for the rest, well, it's the Great Siloing. Everyone's flying blind. With nobody moving companies, no company knows where they stand on the AI adoption curve. Nobody knows how they're doing compared to everyone else. Half of them just check a box: "We enabled {Copilot/Cursor} for everyone!" Cue smug celebrations. They think this is like getting SOC2 compliance, just a thing they turn on and now it's "solved." And they don't realize that they've done effectively nothing at all. All because of a hiring freeze.
Matteo Cassese retweeted
Andrej Karpathy @karpathy
Judging by my TL there is a growing gap in understanding of AI capability.

The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes, I also saw the viral videos of OpenAI's Advanced Voice Mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state-of-the-art agentic models of this year, especially OpenAI Codex and Claude Code.

But that brings me to the second issue. Even if people paid $200/month to use the state-of-the-art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing, because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along.

So that brings me to the second group of people, who *both* 1) pay for and use the state-of-the-art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math, and research. This group of people is subject to the highest amount of "AI psychosis," because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work.

It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions. TL;DR: the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and, I think, slightly orphaned (?) Advanced Voice Mode will fumble the dumbest questions in your Instagram reels, and *at the same time* OpenAI's highest-tier, paid Codex model will go off for an hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because of two properties: 1) these domains offer explicit reward functions that are verifiable, meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed, yes or no, in contrast to writing, which is much harder to explicitly judge), and 2) they are a lot more valuable in B2B settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.
staysaasy @staysaasy (quoted)

The degree to which you are awed by AI is perfectly correlated with how much you use AI to code.
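Karpathy's "verifiable rewards" point can be made concrete with a toy sketch: in code domains the reward is an explicit pass/fail check, which is what makes RL training tractable there in a way that free-form writing is not. Everything below (function names, the toy task) is illustrative, not taken from any real training stack:

```python
def unit_test_reward(candidate, test_cases):
    """Verifiable reward: 1.0 if the candidate passes every test, else 0.0.

    This is the explicit, binary signal that code tasks provide
    and that prose quality does not.
    """
    for args, expected in test_cases:
        try:
            if candidate(*args) != expected:
                return 0.0
        except Exception:
            # A crashing solution also earns zero reward.
            return 0.0
    return 1.0


# Toy task: "return the sum of two numbers".
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]

good_solution = lambda a, b: a + b   # a correct model output
bad_solution = lambda a, b: a - b    # a plausible-looking wrong output

print(unit_test_reward(good_solution, tests))  # 1.0
print(unit_test_reward(bad_solution, tests))   # 0.0
```

An RL loop can hill-climb on this signal directly; there is no equivalently crisp check for "was this essay well written," which is one reason the capability gains are so lopsided.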

Matteo Cassese retweeted
Andrej Karpathy @karpathy
Someone recently suggested to me that the reason the OpenClaw moment was so big is that it's the first time a large group of non-technical people (who otherwise only knew AI as synonymous with ChatGPT as a website) experienced the latest agentic models.
Matteo Cassese retweeted
Andrea Volpini @cyberandy
Future agents will not win by replaying longer transcripts. They will win by consolidating experience into durable, navigable memory. Claude Code points to curated memory. Knowledge graphs point to reasoned memory. That is where this is going.
Matteo Cassese @matteoc
AI just freed you from the productivity prison. The robots came for the creatives first. Not factory workers like we were told. That changes everything. Now you get to discover your actual human value. The age of wisdom is beginning. Watch: youtube.com/watch?v=VPryVG…