Mint Studio retweeted

The web is being rebuilt around AI agents coordinating with each other, and until now nobody had actually measured how well that works.
Together with Carnegie Mellon University, we've released AgentWebBench, the first benchmark designed to test decentralized agent-to-agent coordination across 100 websites and 18.4 million documents.
We built this because we care about a healthy ecosystem, not just single-model demos, and the findings are more nuanced than the hype suggests.
Decentralized coordination currently underperforms traditional search in most cases, but wins on factual Q&A and closes the gap fast as models scale.
More importantly, the research surfaces real design principles for anyone building in this space: agents concentrate traffic on a small set of sources in ways that threaten open web diversity, planning matters more than raw model power, and we now have a proper framework for diagnosing where agent systems actually break down.
This matters to @anaxilabs as we build our global data and agent supply chain for robotics and AI systems, where agentic AI is expected to become the coordination and abstraction layer for robotics.
AgentWebBench turns abstract ecosystem concerns into measurable outcomes, and we think that's exactly what the industry needs right now.
Full paper below. 👇



