Eric Barroca

7.6K posts

@ebarroca

founder at @vertesiahq, platform for AI-driven knowledge work.

Tokyo-to, Japan · Joined August 2008

508 Following · 1.1K Followers

Eric Barroca@ebarroca·1d

Context engineering is not prompt assembly. It's the durable infrastructure that prepares, scopes, ranks, governs, and audits what the model sees per turn. The window is the output. The layer is the work
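
A minimal sketch of the idea above, assuming a toy in-memory setup: the `Chunk` type, the role names, the precomputed `score`, and the character budget are all hypothetical, chosen only to show scoping, ranking, and auditing happening before a window exists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    score: float              # relevance to the current turn (assumed precomputed)
    allowed_roles: frozenset  # who may see this chunk (hypothetical governance rule)


def build_window(chunks, role, budget_chars):
    """Scope, rank, and audit; the returned window is only the output."""
    audit = [(c.doc_id, "denied") for c in chunks if role not in c.allowed_roles]
    # Scope and govern: keep only what this role may see.
    visible = [c for c in chunks if role in c.allowed_roles]
    # Rank: most relevant first.
    visible.sort(key=lambda c: c.score, reverse=True)
    window, used = [], 0
    for c in visible:
        if used + len(c.text) > budget_chars:
            audit.append((c.doc_id, "over_budget"))
            continue
        window.append(c.text)
        used += len(c.text)
        audit.append((c.doc_id, "included"))
    return window, audit


chunks = [
    Chunk("policy-7", "Refunds within 30 days.", 0.9, frozenset({"support", "legal"})),
    Chunk("contract-2", "Penalty clause 4.1 applies.", 0.8, frozenset({"legal"})),
    Chunk("faq-1", "Contact support via chat.", 0.4, frozenset({"support"})),
]
window, audit = build_window(chunks, role="support", budget_chars=30)
print(window)
print(audit)
```

The audit trail records every decision, including what was left out and why, which is the part a prompt-assembly script usually drops.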

Eric Barroca@ebarroca·1d

@mitchellh Very good point :) The issue with an agent loop is that it influences itself as well. Curious: with the right architecture framing, from the start or mid-loop, would it have gotten it right?

Mitchell Hashimoto@mitchellh·2d

I've got an agent in a loop optimizing a renderer with the goal to minimize frame times (and tests to measure). It got times down from 88ms to 2ms and allocations down from ~150K to 500. Sounds good, right? Wrong. This is exactly why agent psychosis is a big fucking problem. As an experiment, I rewrote the Ghostty core render state in Go, with access to identically laid out data structures as Ghostty and the exact same validation tests. I made a purposely naive renderer (simple, correct, but slow). 88ms per frame with 150,000 allocations (horrendous, lol)! I then kickstarted a Ralph loop to bring the frame times down. I told it it can't modify input data structures or the public API or tests (they're correct), but it can do anything else it wants. It got to work. It has worked for about 4 hours. I've spent around $350 on this experiment so far. The results? 88ms => 1.5ms; 150K allocs => ~500 allocs. Incredible, right? Nope. My hand-written renderer I ported has frame times (same benchmark) of ~20us (0.020ms) and 0 allocations in the update path. This is the problem with psychosis and lacking systems understanding. If you don't understand the system, you're going to accept that this is an incredible result. If you understand the system, you'll see better solutions immediately and can do roughly 75x better on throughput. The people who blindly trust agent output are in the former camp. They're sheeple, overdrinking from a fountain of mediocrity. Standard disclaimer: I use AI all the time. I like AI. The point I'm making is to not blindly accept results. Think. Analyze. Learn.
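
The "roughly 75x" figure follows directly from the frame times quoted in the post. A small worked check, using only those numbers (converted to microseconds so the division stays exact):

```python
# Frame times quoted in the post, in microseconds.
naive_us = 88_000   # 88 ms: the purposely naive renderer
agent_us = 1_500    # 1.5 ms: after the agent loop
hand_us = 20        # ~20 us: the hand-written renderer

agent_speedup = naive_us / agent_us  # what the loop gained over the baseline
remaining_gap = agent_us / hand_us   # how far the loop's result is from hand-written

print(f"agent vs naive: {agent_speedup:.1f}x")         # agent vs naive: 58.7x
print(f"hand-written vs agent: {remaining_gap:.0f}x")  # hand-written vs agent: 75x
```

A ~59x improvement looks impressive in isolation; the 75x gap only shows up when there is a reference point from someone who understands the system.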

Eric Barroca@ebarroca·1d

It’s going to converge with panels and buttons for sure, because that’s a lot more efficient and systematic. But they will likely be panels and buttons made by AI, making dashboards a lot more configurable and adaptive.

Jason Fried@jasonfried

I think you're going to see it's all going to converge back to screens and data and panels and buttons. People don't want to ask the same question over and over. They'll ask something, it'll be set up to show something, and that thing will be saved as something they can always look at. Stable pre-defined glances, not blank slates each time. Common questions will become buttons and panels again. Most people ask the same kinds of questions about what they work on most of the time. Having to start from scratch with the questions every time seems like a step backwards. Another way to put this: Questions are wonderful for a deeper dive, but not a daily drive. Not sure you're suggesting questions always, but the comparison screenshots looked that way.

Eric Barroca@ebarroca·1d

Yes, and the shape of that data matters. Mainframes had records. Cloud had databases. Agents need documents and data, together, prepared and addressable. Otherwise the model is reasoning over a bag of words, not over the truth.

MongoDB@MongoDB

In every tech transformation, something changes. But one thing has stayed constant. Our President & CEO Chirantan @cj_mongodb joined @HarryStebbings of @20vcFund to discuss why data remains the constant across every major technology shift — from mainframes, to cloud, to AI. Watch the full conversation: mongodb.social/6017B8VcEb

Eric Barroca@ebarroca·2d

Vector store ≠ context layer. The vector store loses what makes documents documents: structure, version, identity, governance. RAG is a useful component inside the context layer. It is not the layer itself

Eric Barroca@ebarroca·2d

Most attempts at agent governance reach for the leash: block this, restrict that, add a gate. A leash stops one action. A contract defines a whole space of allowed action — and refuses everything outside it. Process engines built for code did not need contracts. Process engines built for agents do. That is the architecture.
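
One way to read the leash/contract distinction is as denylist versus allowlist. A minimal sketch, with hypothetical action names that are not from any real process engine:

```python
DENYLIST = {"delete_record"}               # leash: blocks the actions someone thought of
CONTRACT = {"read_record", "draft_reply"}  # contract: the entire allowed space


def leash_allows(action: str) -> bool:
    # Anything not explicitly blocked gets through.
    return action not in DENYLIST


def contract_allows(action: str) -> bool:
    # Anything not explicitly allowed is refused.
    return action in CONTRACT


for action in ["read_record", "delete_record", "wire_funds"]:
    print(f"{action}: leash={leash_allows(action)} contract={contract_allows(action)}")
```

The unanticipated `wire_funds` action passes the leash and fails the contract, which is the difference the post is pointing at.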

Eric Barroca@ebarroca·2d

@petarivanovv9 AI makes much of the other part a lot faster too: debugging, reviewing, testing (and writing the tests), deciding, etc.

Petar Ivanov@petarivanovv9·3d

Writing code was never the bottleneck. In a typical sprint, an engineer spends maybe 20% of their time on net-new code. The rest is reading, debugging, reviewing, deciding, communicating, and operating. AI makes the 20% piece ten times faster and barely touches the other 80%, except by adding more code to review, which makes that part slower.

Eric Barroca@ebarroca·2d

Documents are often the truth. Systems of record are representations. When the two disagree – at audit, in a claim, in a contract dispute – the document is what gets asked for. An agent that only reads the system row is reading a derivative, not the truth

Eric Barroca@ebarroca·2d

The death of SaaS was largely overestimated

Marc Benioff@Benioff

🍹Saasparillas flowing, saaspocalypse in full swing 😂 We’re getting sassy with unstoppable momentum. Salesforce just dropped an absolute monster Q1: 📈 Record revenue: $11.13B (+13% YoY) 💰 Operating cash flow: $6.7B 🤖 Agentforce ARR just crossed $1B Combined with Data 360 and Informatica, we’re now at $3.4B in AI + Data ARR. We’re not just talking about the agentic future — we’re delivering it. The #1 Agentic CRM, powering the shift to agentic enterprises at scale. The momentum is real. The future is unstoppable. 🔥 #Agentforce #Salesforce #AI #EnterpriseSoftware

Eric Barroca@ebarroca·3d

Yes. "Access scope evolves with capability" is the right shape. The harder problem is doing it at runtime: which scope this agent gets for this task, on whose behalf, with what credentials, and audited as whom. Sandboxing limits the blast radius. The contract decides what's allowed in the first place.

Anthropic@AnthropicAI

New on the Engineering Blog: The access and permissions we grant agents should evolve with their capabilities. In our own products, we set these parameters through sandboxing, which limits the scope of any potentially destructive actions. Read more: anthropic.com/engineering/ho…
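
The runtime question raised in the reply (which scope, for which task, on whose behalf, audited as whom) could be sketched as a grant computed per task. This is an illustration under stated assumptions: `Grant`, `resolve_grant`, and the scope strings are hypothetical, and credential handling is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    agent: str
    task: str
    on_behalf_of: str
    scopes: frozenset


def resolve_grant(agent, agent_scopes, task, task_needs, user, user_scopes, audit_log):
    # Effective scope: what the task needs, capped by both the agent and the user.
    effective = frozenset(task_needs) & frozenset(agent_scopes) & frozenset(user_scopes)
    audit_log.append({
        "agent": agent,
        "task": task,
        "acting_for": user,
        "granted": sorted(effective),
        "refused": sorted(set(task_needs) - effective),
    })
    return Grant(agent, task, user, effective)


log = []
grant = resolve_grant(
    "claims-agent", {"claims:read", "claims:write", "payments:send"},
    "summarize-claim", {"claims:read", "payments:send"},
    "alice", {"claims:read", "claims:write"},
    log,
)
print(sorted(grant.scopes))
print(log[0]["refused"])
```

The agent is capable of `payments:send`, but the user it acts for is not, so the grant for this task excludes it and the refusal is recorded against both identities.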

Eric Barroca@ebarroca·3d

Tests are the contract the agent has to honor. Without that contract – encoded somewhere the harness can check before accepting the change – the agent can refactor anything into anything. Red/green is the cheapest way to enforce the boundary

Simon Willison@simonw

(I'm firmly on team red/green TDD for agent code, I like having a test suite that protects against them breaking old features when they make new changes - simonwillison.net/guides/agentic…)
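
A minimal sketch of a harness-side red/green gate, assuming tests are plain callables that take an implementation and return a boolean. The function names and the toy `add` implementations are hypothetical, not taken from either post or from any test framework.

```python
def run_suite(tests, impl):
    """Return the names of tests that fail for the given implementation."""
    failures = []
    for name, check in tests.items():
        try:
            ok = check(impl)
        except Exception:
            ok = False
        if not ok:
            failures.append(name)
    return failures


def accept_change(old_impl, new_impl, existing_tests, new_test_name, new_test):
    # Red: the new test must fail on the old code, or it proves nothing.
    if not run_suite({new_test_name: new_test}, old_impl):
        return False, "new test was not red before the change"
    # Green: every existing test and the new test must pass on the new code.
    failures = run_suite({**existing_tests, new_test_name: new_test}, new_impl)
    if failures:
        return False, f"failing: {failures}"
    return True, "accepted"


existing = {"adds": lambda f: f(2, 3) == 5}
handles_none = lambda f: f(2, None) == 2

old = lambda a, b: a + b         # raises TypeError on None, so the new test is red
good = lambda a, b: a + (b or 0)  # handles None and still adds
bad = lambda a, b: a              # handles None but silently breaks addition

print(accept_change(old, good, existing, "handles_none", handles_none))
print(accept_change(old, bad, existing, "handles_none", handles_none))
```

The `bad` change satisfies the new requirement and is still rejected, because the existing suite is the boundary it may not cross.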

Eric Barroca@ebarroca·3d

Happy path is the demo. Steps 11 through 30 are the system: error handling, partial failures, retries, audit, governance, the human in the loop. That's where most of the work lives, and it's the part most CEOs never see when they play with an agent for an hour.

Aaron Levie@levie

CEOs are uniquely prone to AI psychosis because they’re sufficiently distant from the last mile of work that still has to happen to generate most value with AI. So when they play with AI, they see the happy path results, often not considering the next 10 or 20 things that have to happen to get sustainable results from agents. “Look I made this awesome product prototype”. Yes but you didn’t have to review the code before it went into production and fix a bunch of issues. “Look I generated a contract”. Yes but you didn’t verify all the terms before it goes out to the counterparty and didn’t have to wire up all the past contracts to work with. The best thing you can do as a CEO is to use AI a *ton* to figure out the real implications of agents in the enterprise, and come out the other side with an appreciation for both the upside and the real work that goes into them.

Eric Barroca@ebarroca·4d

Finally came close to hitting my limit on Codex 5.5 this week, and couldn’t see how to buy more capacity, so I went back to Claude Opus 5.7 as my main model to compare. Feels like such a downgrade. It can do things just as well but feels so slow and chatty. I want a machine, not a friend.

Eric Barroca@ebarroca·4d

Stop dropping PDFs into a chat window and calling it AI. Real enterprise content is not chat input. It is contracts, claims, policies, statements, case files, slides, tables — and each one needs intake, structured extraction, embeddings, lineage, permissions, versioning. The repository owns all of that. The chat window owns none of it. One is a tool, the other is infrastructure.

Eric Barroca@ebarroca·4d

Intelligence Is Contextual. Designing the enterprise context layer. The context window is not the context layer. The window is the output. The layer is the work — durable infrastructure around documents and data. ebarroca.substack.com/p/intelligence…

Eric Barroca@ebarroca·4d

Yes, and the artifact library matters as much as the UI. If an agent can generate a spreadsheet, report, diagram, or app, the system has to keep those artifacts addressable, versioned, and attached to the work that produced them

Tomasz Tunguz@ttunguz

Software's future: a harness to control AI-generated UIs + a context library of artifacts. The interface isn't going away — it's become malleable to whatever the user needs, when they need it. Read more: tomtunguz.com/plastic-user-i…
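
"Addressable, versioned, and attached to the work that produced them" can be made concrete with a small in-memory store. This is a sketch only: `ArtifactLibrary`, its methods, and the run ids are hypothetical, and the content address is simply a SHA-256 of the bytes.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    name: str
    version: int
    address: str      # content hash: a stable identifier for these exact bytes
    produced_by: str  # id of the task or run that generated the artifact


class ArtifactLibrary:
    def __init__(self):
        self._versions = {}  # name -> list of Artifact, oldest first
        self._blobs = {}     # address -> bytes

    def put(self, name, content: bytes, produced_by: str) -> Artifact:
        address = hashlib.sha256(content).hexdigest()
        history = self._versions.setdefault(name, [])
        artifact = Artifact(name, len(history) + 1, address, produced_by)
        history.append(artifact)
        self._blobs[address] = content
        return artifact

    def get(self, name, version=None):
        history = self._versions[name]
        artifact = history[-1] if version is None else history[version - 1]
        return artifact, self._blobs[artifact.address]


lib = ArtifactLibrary()
lib.put("q3-report.csv", b"region,total\nEMEA,10\n", produced_by="run-41")
lib.put("q3-report.csv", b"region,total\nEMEA,12\n", produced_by="run-42")

latest, _ = lib.get("q3-report.csv")
first, first_bytes = lib.get("q3-report.csv", version=1)
print(latest.version, latest.produced_by)
print(first.version, first.produced_by)
```

Earlier versions stay retrievable, and each one carries the id of the run that generated it.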

Eric Barroca@ebarroca·5d

The frontier keeps moving. The best reasoning model this quarter is not the best one next quarter. Same for visual, extraction, long-context. Each is a different leaderboard, each with a different winner. A serious agent platform should route by task, cost, latency, risk, and quality. Not by one vendor's model family.
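
Routing "by task, cost, latency, risk, and quality" might look like a filter followed by a quality pick. A hedged sketch: the model names, scores, prices, and latencies below are invented for illustration and do not describe any real vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    quality: dict                 # task -> score in [0, 1] (made-up numbers)
    cost_per_call: float
    latency_ms: int
    approved_for_high_risk: bool


def route(models, task, max_cost, max_latency_ms, high_risk):
    """Filter by hard constraints, then pick the best quality for the task."""
    candidates = [
        m for m in models
        if task in m.quality
        and m.cost_per_call <= max_cost
        and m.latency_ms <= max_latency_ms
        and (m.approved_for_high_risk or not high_risk)
    ]
    if not candidates:
        return None  # no model fits; the caller decides whether to relax a limit
    return max(candidates, key=lambda m: m.quality[task])


CATALOG = [
    Model("vendor-a-large", {"reasoning": 0.9, "extraction": 0.7}, 0.05, 4000, True),
    Model("vendor-b-fast", {"reasoning": 0.6, "extraction": 0.8}, 0.01, 800, False),
]

cheap_extraction = route(CATALOG, "extraction", max_cost=0.02, max_latency_ms=1000, high_risk=False)
risky_reasoning = route(CATALOG, "reasoning", max_cost=0.10, max_latency_ms=5000, high_risk=True)
no_fit = route(CATALOG, "extraction", max_cost=0.02, max_latency_ms=1000, high_risk=True)
print(cheap_extraction.name, risky_reasoning.name, no_fit)
```

Because the catalog is data, swapping next quarter's leaderboard winner in is a table update rather than a code change.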

Eric Barroca@ebarroca·5d

The question is not whether agents need apps like humans do. It is whether the app holds the state, permissions, tasks, and evidence the agent needs. Some apps become less UI and more substrate

Jason ✨👾SaaStr.Ai✨ Lemkin@jasonlk

When evaluating public and more mature software companies, just ask if an AI Agent would need them. Or not. Do agents need to Zoom? I don’t think so. Do agents need to Salesforce? Actually, yes. At least for us. That’s in fact where they interact with each other. Do agents need to … use you?

Eric Barroca@ebarroca·5d

Full essay: ebarroca.substack.com/p/why-agent-fr…

Eric Barroca@ebarroca·5d

Agent. Tool. Memory. Planner. Executor. Router. Skill. Chain. Graph. Worker. Crew. Every concept gets a name. Every name gets an abstraction. At some point, you are not building a system anymore. You are assembling a vocabulary. The problem is not that the ideas are wrong. It is that they become framework objects instead of system responsibilities. So you get: hidden behavior, implicit control flow, hard-to-debug systems, and layers you have to read source code to understand. The better framing is simpler: reasoning, execution, state, tools, boundaries. The hard part is not naming things. The hard part is making the system predictable.
