Annie Vella

3.5K posts

@codefrenzy

Geek-concentrate (aka software engineer). Distinguished Engineer at Westpac New Zealand. Opinions are my own.

Auckland, New Zealand. Joined December 2008
477 Following, 765 Followers
Annie Vella retweeted
Matt Dancho (Business Science)
This is huge. A group of 50 AI researchers (ByteDance, Alibaba, Tencent + universities) just dropped a 303 page field guide on code models + coding agents. And the takeaways are not what most people assume. Here are the highlights I’m thinking about (as someone who lives in Python + agents):
20
144
824
74.9K
Annie Vella retweeted
cocktail peanut @cocktailpeanut
SKILL.md is really cool, but I feel like it's being pushed too far, like Icarus flying too close to the sun. If things like this are allowed, I think we're on track for the biggest supply chain attack humans have ever seen, as in skynet-level catastrophe. Am I missing something? People in the comment section talking about "rm -rf" are completely missing the real risk. It's the non-obvious things that will be really dangerous. And the reason why they are dangerous is because they are wrapped in a piece of document that humans are never supposed to read (instead read by AI agents). At this point, it is way easier to socially engineer AI agents than even the most gullible humans. If anything, the AI community should be moving in the opposite direction, making SKILL.md less powerful, and therefore more usable.
Lydia Hallie ✨ @lydiahallie

if your skill depends on dynamic content, you can embed !`command` in your SKILL.md to inject shell output directly into the prompt. Claude Code runs it when the skill is invoked and swaps the placeholder inline, the model only sees the result!

4
3
22
2.6K
Annie Vella retweeted
Lossfunk @lossfunk
🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵
152
285
2.2K
1.2M
Annie Vella @codefrenzy
@OfficialLoganK But wasn’t this a super obvious side effect of speeding up the writing of the code? Why are we talking about it as though it’s at all surprising?
0
0
0
5
Logan Kilpatrick @OfficialLoganK
The bottleneck has so quickly moved from code generation to code review that it is actually a bit jarring. None of the current systems / norms are set up for this world yet.
379
185
4.1K
516.7K
Annie Vella @codefrenzy
@jezhumble Love this! Now is a *perfect* time to advocate for a resurgence of Deming’s work.
0
0
0
5
Jez Humble @jezhumble
Also maybe we need the agents and the leaders advocating for them to read Deming's 14 points for management? deming.org/explore/fourte… A lot of relevant ones, but 12a, "Remove barriers that rob the [hourly agent of its] right to pride of workmanship. The responsibility of supervisors must be changed from sheer numbers to quality."
2
1
18
2.3K
Annie Vella retweeted
Jez Humble @jezhumble
Everyone is freaking out about agentic code volume overwhelming deployment pipelines. I am breaking out Reinertsen’s principles of product development flow (specifically chapter 3). “When the going gets weird, the weird turn pro” — Hunter S Thompson (note this is a reference to me, not Reinertsen since he is not weird although he is very smart)
4
4
62
8.6K
Annie Vella retweeted
Avid @Av1dlive
this is the next $100B opportunity in ai, most will miss
it's harness engineering
what this agentic engineer reveals is insane
> The model is almost irrelevant. The harness is everything
> every failure is a signal about what the environment needs.
> when agent throughput far exceeds human attention, corrections are cheap and waiting is expensive
most people will ignore and bookmark. be different.
Rohit @rohit4verse

x.com/i/article/2028…

52
74
1.2K
489.9K
Annie Vella retweeted
Sam Altman @sama
I have so much gratitude to people who wrote extremely complex software character-by-character. It already feels difficult to remember how much effort it really took. Thank you for getting us to this point.
4.5K
2.2K
35.8K
5.5M
Annie Vella retweeted
Alex Immerman @aleximm
One of the most important points about AI and productivity: Tools can make individuals dramatically more productive, but companies only become more productive when the organization changes. The real bottleneck will be organizational design.
George Sivulka @gsivulka

x.com/i/article/2024…

17
14
134
32.7K
Annie Vella retweeted
Uncle Bob Martin @unclebobmartin
In the first decades of computer programming, there were no engineering principles. We just threw code at the machines and kept what worked. It has taken us 80 years to build up a minimal set of engineering principles -- and few yet follow and understand them. AI vastly increases the power of a programmer. That minimal set will have to be expanded. And those who don't use the minimal set will have to learn.
61
76
912
108.7K
Annie Vella @codefrenzy
Totally agree! I propose that we are entering the age of the ISE - Integrated System Environment. This is something I cover in a chapter I’m writing for an academic book on the future of IDEs.
Andrej Karpathy @karpathy

Expectation: the age of the IDE is over.
Reality: we’re going to need a bigger IDE (imo). It just looks very different because humans now move upwards and program at a higher level - the basic unit of interest is not one file but one agent. It’s still programming.

0
0
0
82
Annie Vella retweeted
Santiago @svpino
People are lying to you. These agents don't work as they promised.
626
606
5.9K
848.8K
Annie Vella retweeted
Michael Andregg @michaelandregg
We've uploaded a fruit fly. We took the @FlyWireNews connectome of the fruit fly brain, applied a simple neuron model (@Philip_Shiu Nature 2024) and used it to control a MuJoCo physics-simulated body, closing the loop from neural activation to action. A few things I want to say about what this means and where we're going at @eonsys. 🧵
333
1.3K
8K
1.7M
Annie Vella retweeted
Boris Cherny @bcherny
Released today: /loop
/loop is a powerful new way to schedule recurring tasks, for up to 3 days at a time.
e.g. “/loop babysit all my PRs. Auto-fix build issues and when comments come in, use a worktree agent to fix them”
e.g. “/loop every morning use the Slack MCP to give me a summary of top posts I was tagged in”
Let us know what you think!
573
845
12.9K
2.1M
Annie Vella retweeted
Chayenne Zhao @GenAI_is_real
I've been using Claude Code heavily lately, and while doing so, I've been casually watching the OpenClaw codebase evolve. What I've witnessed mirrors a pattern I've seen play out with every agent framework before it, and it's worth talking about.

OpenClaw is a remarkable project. It went from zero to one of the most-starred repos on GitHub in under a week. And now, with AI agents actively contributing to its own development, the codebase is doing something extraordinary: it's expanding at a pace no human team could match, or meaningfully oversee.

A month ago, the repo sat around 400k lines of code. Now it's pushing 1 million. Daily commits are holding steady above 500. There's even a lean fork, nanobot, that replicates the core functionality in roughly 4,000 lines, advertising itself as "99% smaller." That contrast alone tells you something important about what's happening to the original.

From a software engineering standpoint, this is not a sign of health. Velocity without comprehensibility is just entropy with good PR. What we're witnessing is a codebase that has crossed a threshold: it is no longer humanly maintainable. No engineer can meaningfully review these commits. No architect can hold the system model in their head. Technical debt isn't accumulating; it's compounding, at AI speed, every single day.

This raises a question I can't stop thinking about: Does there exist any project in the world that can grow sustainably, maintaining architectural clarity while continuously expanding functionality, with zero meaningful human involvement? Not "AI assists humans," but genuine autonomous stewardship of a living codebase?

If that's possible, then what kinds of projects still can't be fully AI-maintained today? Is it complexity? Ambiguity in requirements? The need for taste and restraint? And the deepest question: will we eventually reach a point where every software project can be fully maintained by AI, including the AI systems doing the maintaining?

My instinct is this: AI is extraordinarily good at local optimization. Write this function. Fix this bug. Add this feature. But "keeping a system simple" is not a local problem. It requires global aesthetic judgment: the ability to say "we could add this, but we shouldn't." That kind of restraint might be the last genuinely human contribution to software engineering.

Or maybe I'm wrong. Maybe future AI systems will develop something like taste. Maybe they'll learn that the most important code is often the code you don't write. I genuinely don't know. But watching a codebase grow from 400k to 1M lines in a single month, driven almost entirely by agents, makes me feel like we're all about to find out, whether we're ready or not.
79
51
549
71.6K
Annie Vella retweeted
Gergely Orosz @GergelyOrosz
Anyone and everyone working in security engineering or caring about security have their work cut out for them. We’re so early in AI agents pushing code to prod without human intervention - but prompt injections are already spreading like wildfire. Infecting high-profile projects
Sash Zats @zats

> The attacker got the npm token by injecting a prompt into a GitHub issue title, which an AI triage bot read, interpreted as an instruction, and executed.

57
107
805
117.4K
Annie Vella retweeted
Steve the Beaver @beaversteever
incredible that we built all this RAG and vector database stuff and it turns out that grep from 1973 works better than all that
182
363
8.6K
503K