Agents Applied

7 posts

@AgentsApplied

The research edge for AI builders. Every week, we surface research that matters from the world's leading labs and ship it as a practical strategy newsletter.

Joined March 2026
22 Following · 2 Followers
mscode07 (@mscode07)
Drop your product 👇 Let's do some Marketing!!
Agents Applied (@AgentsApplied)
Research-backed AI strategy content. We dig through AI research papers every week, pull out what actually matters for builders and execs, and send a clean digest of the practical insights every Sunday morning. No 50-page papers. Just the insights, in your inbox. agentsapplied.com
Kaito (@KaiXCreator)
Builders on X What are you building right now? App. Startup. Side project. Content. I want more builders on my timeline. Drop what you're working on 👇🏻
Agents Applied (@AgentsApplied)
A team gave their AI agent the perfect tools. Hand-picked by human experts. Exactly the right skills for the job.

It still failed.

Not because the tools were wrong. Because there was nothing telling the agent how to use them together. This is the finding most enterprise AI leaders aren't ready for.

Researchers at Shanghai AI Lab just published a controlled study on what actually determines AI agent output quality across an ecosystem of 280,000 publicly available skills.

The results:
→ An agent given a flat list of 200,000 skills performed no better than one given no skills at all
→ An agent given perfect, hand-selected tools still lost when those tools were invoked without structure
→ The same skills, organized into a dependency-aware pipeline, produced categorically better output every time

The controlling variable was never the tools. It was always the architecture.

When a flat list of tools exceeds a few hundred items, the agent can't see most of them. And even when it finds the right ones, each tool runs blind. Tool B doesn't know what Tool A produced. There is no pipeline. There is only sequential guessing.

The researchers built AgentSkillOS to fix this. A capability tree that makes 280,000 skills navigable. A dependency graph that coordinates execution before a single tool is invoked.

-----

Want the full breakdown delivered to your inbox? Every week, Agents Applied takes the most important AI agent research coming out of labs like OpenAI, Atlassian, and DeepMind and translates it into strategy you can actually use. Free. No fluff. Straight to the point.

Subscribe in 30 seconds at agentsapplied.com
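The "dependency-aware pipeline" idea can be pictured in a few lines. This is a hypothetical sketch, not AgentSkillOS code (the tool names and `run_pipeline` helper are invented for illustration): each tool declares its prerequisites, a topological sort fixes the execution order, and every tool receives its predecessors' outputs instead of running blind.

```python
from graphlib import TopologicalSorter

def run_pipeline(tools, deps):
    """Run tools in dependency order, feeding each tool its prerequisites' outputs.

    tools: name -> fn(inputs dict) -> output
    deps:  name -> set of prerequisite tool names
    """
    order = TopologicalSorter(deps).static_order()  # prerequisites come first
    results = {}
    for name in order:
        inputs = {d: results[d] for d in deps.get(name, ())}
        results[name] = tools[name](inputs)
    return results

# Toy three-step pipeline: fetch -> extract -> summarize.
tools = {
    "fetch": lambda _: "raw report text",
    "extract": lambda i: f"entities from ({i['fetch']})",
    "summarize": lambda i: f"summary of {i['extract']}",
}
deps = {"fetch": set(), "extract": {"fetch"}, "summarize": {"extract"}}

out = run_pipeline(tools, deps)
# "summarize" sees what "extract" produced -- no sequential guessing.
```

With a flat list, each lambda would be invoked with no inputs at all; the dependency graph is what turns three isolated calls into one pipeline.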
0xSero (@0xSero)
Agent-browser is the best CLI tool I have given to my agents. It lets them control my browser and all my Electron apps (Discord, VS Code, Slack, etc.). It barely consumes any tokens compared to things like Playwright, ships with a great skill, and the agents seem very comfortable with it.

Some workflows:
- e2e testing an application
- setting up complicated sites for me
- scanning through tons of messages

Thanks Vercel github.com/vercel-labs/ag…
Agents Applied (@AgentsApplied)
The Enterprise Research Reports Your AI Is Getting Quietly Wrong

Your RAG system returned an answer. It was fluent, professionally formatted, and cited three internal documents. It also missed a massive regulatory exposure buried in document 17 of 40, because the system never retrieved it. The report went to the CFO. A decision was made.

This isn't a hallucination. The model didn't invent a fact. The failure happened upstream, in an architecture designed for simple lookup queries that is now being forced to do research-grade synthesis. "Analyze our Q3 vendor risks" is not a search query. It is a research brief. But single-pass RAG treats it like a search query every single time.

Researchers at Atlassian just published a framework called ADORE (Adaptive Deep Orchestration for Research in Enterprise) that doesn't patch the RAG pipeline. It replaces the underlying architecture entirely.

The results? On the DeepResearch Bench, ADORE ranked #1. In blind head-to-head evaluations against ChatGPT Deep Research on real business consulting tasks, it won 77.2% of the time and lost only 4.4%.

ADORE works because it is built on one central insight: a research report is only as trustworthy as the evidence it was built from. Here is how the architecture enforces that:

Memory-Locked Synthesis: The AI is structurally constrained. No source in the evidence locker = no sentence in the final report. The model physically cannot write a claim it cannot cite.

Self-Auditing Retrieval: It doesn't stop after one pass. The system continuously audits whether each planned section has sufficient source coverage. If a section's evidence is thin, it automatically writes new queries and hunts for more before it starts writing.

Section-Packed Grounding: Fixes the "lost in the middle" problem by only feeding the model the exact citations needed for the specific section it's writing, rather than dumping a 100-page document into the context window and hoping for the best.
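The "no source in the locker = no sentence in the report" constraint can be sketched as a hard gate in code. This is an illustrative toy, not the paper's implementation; `EvidenceLocker` and `emit_sentence` are invented names.

```python
class EvidenceLocker:
    """Toy evidence store: sentences may only cite sources registered here."""

    def __init__(self):
        self._sources = {}

    def add(self, source_id, text):
        self._sources[source_id] = text

    def has(self, source_id):
        return source_id in self._sources

def emit_sentence(locker, sentence, cited_ids):
    # Structural constraint: every claim must cite at least one source
    # that actually exists in the locker, or it never reaches the report.
    if not cited_ids or not all(locker.has(s) for s in cited_ids):
        raise ValueError(f"Unsupported claim rejected: {sentence!r}")
    return f"{sentence} [{', '.join(cited_ids)}]"

locker = EvidenceLocker()
locker.add("doc17", "Regulatory exposure noted in vendor contract.")

# A cited claim passes through with its audit trail attached.
line = emit_sentence(locker, "Vendor contracts carry regulatory exposure.", ["doc17"])
```

The point of the gate is that failure is loud: an unsupported sentence raises instead of silently shipping to the CFO.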
The workflow acts more like a research project manager than a chatbot. A Grounding Agent clarifies the vague brief before any retrieval begins. A Planning Agent builds a human-editable research outline. The Execution Agent runs iterative, self-auditing searches. Finally, the Report Generation Agent writes each section using only its locked evidence store.

Imagine the impact on a financial services due diligence team. Instead of junior analysts spending 40 hours manually synthesizing documents, and senior partners wasting time fact-checking rogue AI claims, the system builds an undeniable audit trail. Reviewers check an existing, linked citation path rather than building one from scratch.

The organizations that pull ahead in enterprise AI adoption won't be the ones running the most powerful models. They will be the ones that recognize trustworthy AI output is an architecture decision, not a model selection decision.

Want the full paper analysis? Subscribe to Agents Applied for a weekly breakdown and analysis of the most practical, shortlisted research papers by leading labs like OpenAI, Atlassian, and DeepMind, straight to your inbox.

🔗 agentsapplied.com

Research source: "Orchestrating Specialized Agents for Trustworthy Enterprise RAG" @Atlassian

#EnterpriseAI #AIStrategy #GenerativeAI
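The self-auditing retrieval step described above can be sketched as a coverage loop: keep issuing new queries until every planned section has enough evidence, then write each section from its own evidence only. This is a minimal illustration under invented names (`research`, `MIN_SOURCES`, `fake_search`), not Atlassian's actual ADORE code.

```python
MIN_SOURCES = 2  # assumed coverage threshold per outline section

def research(outline, search):
    """Self-auditing retrieval: re-query thin sections before any writing starts."""
    evidence = {section: [] for section in outline}
    for section in outline:
        attempts = 0
        # Audit loop: hunt for more sources while coverage is insufficient.
        while len(evidence[section]) < MIN_SOURCES and attempts < 5:
            evidence[section].extend(search(f"{section} evidence"))
            attempts += 1
    # Section-packed grounding: each section is written from its own
    # evidence list only, never from the whole document dump.
    return {s: f"{s}: synthesized from {len(evidence[s])} sources" for s in outline}

def fake_search(query):
    # Stand-in retriever returning one source per query.
    return [f"source for {query}"]

report = research(["Q3 vendor risks", "Regulatory exposure"], fake_search)
```

The single-pass RAG failure mode in the post is exactly what the `while` loop removes: retrieval stops when coverage is sufficient, not after the first attempt.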