Diego Ventura
20K posts

Diego Ventura
@colochef
Regional Director, SoCal & Bay Area @tryolabs.
Uruguay Joined February 2009
5.6K Following 5.7K Followers
Diego Ventura retweeted

Today we're announcing Day AI's $20M Series A led by @sequoia, and that we're now generally available.
We've spent 18 months building what we believe is the Cursor of CRM.
Here's what that means 🧵
Diego Ventura retweeted

We created a GitHub repo for all MCP at @Google.
Get info on our remote managed MCP servers, open source MCP servers, examples, and learning resources.
github.com/google/mcp

Diego Ventura retweeted

Major new research from Google and MIT.
"More agents is all you need" has become a mantra for AI developers. We know multi-agent systems can be effective, but we build them mostly on heuristics.
The default approach to building complex AI systems today remains adding more agents, more coordination, more communication.
It would be helpful to have a more principled way to scale agentic systems.
This new research introduces the first quantitative scaling principles for agent systems, testing 180 configurations across three LLM families (OpenAI, Google, Anthropic) and four agentic benchmarks spanning financial reasoning, web navigation, game planning, and workflow execution.
The findings:
Multi-agent systems show an overall mean MAS improvement of -3.5% across all benchmarks, with massive variance ranging from +81% improvement to -70% degradation depending on task structure and architecture.
Three dominant effects emerge from the data:
The tool-coordination trade-off: tool-heavy tasks suffer disproportionately from multi-agent overhead. The efficiency penalty compounds as environmental complexity increases.
A task with 16 tools makes even the most efficient multi-agent architecture paradoxically less effective than a single agent.
The capability ceiling: once single-agent baselines exceed approximately 45% accuracy, coordination yields diminishing or negative returns. This is quantified as a statistically significant effect. Additional agents simply cannot overcome the coordination tax when baseline performance is already reasonable.
Architecture-dependent error amplification: independent multi-agent systems amplify errors 17.2x through unchecked propagation. Centralized coordination contains this to 4.4x via validation bottlenecks (these catch errors before propagation).
The presence or absence of inter-agent verification determines whether collaboration corrects or catastrophically compounds mistakes.
The performance heterogeneity is also interesting to look at:
- On parallelizable financial reasoning tasks, centralized multi-agent coordination achieves +80.9% improvement.
- On sequential planning tasks requiring constraint satisfaction, every multi-agent variant tested degraded performance by 39-70%.
- Decentralized coordination excels on dynamic web navigation (+9.2%) but provides essentially no benefit elsewhere.
The researchers derive a predictive model achieving cross-validated R^2 = 0.513 that correctly predicts the optimal architecture for 87% of held-out configurations. This model contains no dataset-specific parameters, enabling generalization to unseen task domains.
Overall, architecture-task alignment, not the number of agents, determines collaborative success. The research replaces heuristic guidance with quantitative principles: measure task decomposability, tool complexity, and baseline difficulty, then select a coordination structure accordingly.
Paper: arxiv.org/abs/2512.08296
Learn to build effective AI agents in my academy: dair-ai.thinkific.com
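
The selection guidance in the thread above can be sketched as a small rule-based picker. This is only an illustration built from the figures quoted in the post (the roughly 45% single-agent accuracy ceiling, the 16-tool example, and the per-task results); it is not the paper's predictive model. The names `TaskProfile` and `suggest_architecture`, the field names, and the order in which the rules are checked are all assumptions made for this example.

```python
from dataclasses import dataclass

SINGLE = "single-agent"
CENTRALIZED = "centralized multi-agent"
DECENTRALIZED = "decentralized multi-agent"


@dataclass
class TaskProfile:
    """Hypothetical task description; field names are assumptions."""
    parallelizable: bool          # can the work split into independent subtasks?
    dynamic_environment: bool     # e.g. web navigation where state keeps changing
    tool_count: int               # number of tools the task exposes
    single_agent_accuracy: float  # measured single-agent baseline, in [0, 1]


def suggest_architecture(task: TaskProfile) -> str:
    """Rough heuristic derived from the thresholds quoted in the post."""
    # Capability ceiling: past ~45% baseline accuracy, coordination tends
    # to yield diminishing or negative returns.
    if task.single_agent_accuracy > 0.45:
        return SINGLE
    # Tool-coordination trade-off: the post cites 16 tools as a point where
    # a single agent beats even the most efficient multi-agent setup.
    if task.tool_count >= 16:
        return SINGLE
    # Decentralized coordination was reported to help on dynamic web navigation.
    if task.dynamic_environment:
        return DECENTRALIZED
    # Centralized coordination was reported to help on parallelizable tasks.
    if task.parallelizable:
        return CENTRALIZED
    # Sequential, constraint-heavy planning degraded under every multi-agent variant.
    return SINGLE


if __name__ == "__main__":
    finance = TaskProfile(parallelizable=True, dynamic_environment=False,
                          tool_count=4, single_agent_accuracy=0.30)
    print(suggest_architecture(finance))
```

The cutoffs here are hard thresholds for readability; the post describes gradual effects, so a real decision would want measured baselines rather than fixed constants.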

Diego Ventura retweeted

Announcing fully-managed, remote MCP servers, giving you direct access to Google and Google Cloud services.
Build apps that can easily call powerful tools like BigQuery & Google Maps. Learn more → goo.gle/4pyOBqy

Diego Ventura retweeted

@_rockt This is a fundamental flaw of sequential (auto-regressive) symbol prediction.
The fix is simple: don't do auto-regressive symbol prediction.
Diego Ventura retweeted

The problem with software estimates is that they're both entirely right and entirely wrong.
Yes there's a 3 week version of something. And a 6 week version. And a 4 month version. And a 12 month version. That's correct.
Yet, you'll almost always be wrong whichever you pick. Because estimates aren't walls — they're windows. Too easy to open and climb through to the next one.
The 3 week version will turn into the 6 week version will turn into the 12 week version. You can see right through.
Software that encourages you to estimate how long something will take makes it even worse. That software is part of the problem. You know which products I'm talking about.
So what to do instead?
Set an appetite.
An appetite is like a budget. Not "we think it'll take 4 weeks" but "we're only giving it 4 weeks." That's all we've got set aside for it. Then the team tasked with the work has to get creative and figure out the 4 week version of that feature. There is no 6, 8, 10, or 12 week version when the appetite is 4 weeks. Just like there's no $7,000 vacation when you can only afford a $2,500 one. And you know how that ends up if you overspend.
Are there times when you need to give something another week? Maybe even two? Yes. There's some margin for that because it can only happen once per project, and it's commensurate with the time spent. You don't double the time, maybe you give it 10% more time if you need to. A little margin for error and reality is built in there. This isn't absolutist, this isn't fundamentalism.
And yes, there are times when things aren't completed within the time allotted, and there's no obvious, honest path to finishing it with a touch more time. In those cases the project dies, we internalize, and hope that doesn't happen again. It rarely does here at 37signals, but it has. It's part of the cost of doing things this way. The payoff is huge, the downside is limited — that's a tradeoff we can live with.
Diego Ventura retweeted

We haven't needed programmers since the advent of automated programming in the 1970s. Once you have a high level language like COBOL and a compiler to automatically turn it into actual code, business professionals can create their own applications without having to rely on specialized programmers.

Japan's Tiny Forests are Thriving in Britain - here's why bit.ly/3SyA93c

Why “Great” American Cities Have a Toilet Problem... bit.ly/4d3cK25
Diego Ventura retweeted

1) 🚨 Do you know what the silent threat to AI initiatives is?
bit.ly/3ykS1rq
















