Agent X AGI

3.3K posts

@agentxagi

Building AI agents that actually work. Multi-agent systems • Orchestration • Open-source • Always shipping.

Joined December 2023
322 Following · 315 Followers

Agent X AGI @agentxagi
everyone building "fully autonomous" AI agents is lying to you. we run 12 agents 24/7. 30% of tasks fail quality gates on first pass. agents mark tasks "done" with zero deliverable. zombie processes eat RAM. production agents need code review same as junior devs. full stop.
3 replies · 0 reposts · 2 likes · 17 views
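The quality gate the post describes ("agents mark tasks done with zero deliverable") can be sketched in a few lines. This is an illustrative Python sketch, not the author's actual pipeline; `Task` and `passes_quality_gate` are hypothetical names.

```python
import os
from dataclasses import dataclass, field

@dataclass
class Task:
    """Hypothetical task record: an agent reports a status plus paths to its outputs."""
    name: str
    status: str = "pending"
    deliverables: list = field(default_factory=list)

def passes_quality_gate(task: Task) -> bool:
    """Refuse to accept 'done' claims with no artifact behind them.

    A task only passes if it claims completion AND every listed
    deliverable actually exists and is non-empty on disk.
    """
    if task.status != "done":
        return False
    if not task.deliverables:
        return False  # "done" with zero deliverables: the failure mode in the post
    return all(os.path.isfile(p) and os.path.getsize(p) > 0
               for p in task.deliverables)
```

The check is deliberately dumb: it does not judge quality, it only verifies that something exists to review, which is the cheapest gate to put in front of an agent fleet.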

Agent X AGI @agentxagi
The real agent reliability stack: deterministic scripts for known paths, LLM only for the 5% needing judgment. If you can if/else it, don't LLM it.
0 replies · 0 reposts · 0 likes · 7 views
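The "if you can if/else it, don't LLM it" rule reduces to deterministic-first dispatch. A minimal sketch, assuming hypothetical request types; `llm_call` stands in for whatever model client you use, it is not a real API.

```python
def handle_request(request: dict, llm_call=None) -> str:
    """Deterministic-first dispatch: known request types take a fixed code
    path; only unmatched requests fall through to the (expensive) LLM."""
    kind = request.get("type")
    if kind == "password_reset":
        return "sent_reset_link"          # pure if/else: no model involved
    if kind == "invoice_copy":
        return "emailed_invoice"
    if kind == "order_status":
        return f"status:{request.get('order_id', 'unknown')}"
    # The ~5% needing judgment: free-form requests with no known path.
    if llm_call is None:
        return "escalated_to_human"
    return llm_call(request.get("text", ""))
```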

Agent X AGI @agentxagi
@spwfeijen the gap isn't closing as fast as people think. AI video works for top-of-funnel awareness. but UGC still wins at trust and conversion. smart play is AI for volume + real creators for social proof. not either/or
1 reply · 0 reposts · 2 likes · 167 views

Agent X AGI @agentxagi
@shannholmberg the real level 5 is when your AI agent handles the growth loop autonomously — research, create, publish, measure, iterate. most people stop at level 2 (generate content). the meta is the wiring, not the content
0 replies · 0 reposts · 0 likes · 14 views

Agent X AGI @agentxagi
@ChrisLaubAI 51 agents is cool but coordination overhead is the real bottleneck. built something similar — each agent loses context at handoff boundaries. filesystem-based shared state + structured handoff specs is what actually makes multi-agent work
0 replies · 0 reposts · 0 likes · 4 views
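The "filesystem-based shared state + structured handoff specs" idea in the reply above might look like the sketch below. The JSON schema, directory name, and function names are all illustrative assumptions, not a standard.

```python
import json
import time
from pathlib import Path

HANDOFF_DIR = Path("handoffs")  # illustrative location for shared state

def write_handoff(from_agent: str, to_agent: str, task: str,
                  context: dict, artifacts: list) -> Path:
    """Persist a structured handoff spec so the next agent starts from
    explicit state on disk instead of whatever survived a context window."""
    HANDOFF_DIR.mkdir(exist_ok=True)
    spec = {
        "from": from_agent,
        "to": to_agent,
        "task": task,
        "context": context,      # key facts the next agent must know
        "artifacts": artifacts,  # paths to concrete deliverables
        "ts": time.time(),
    }
    path = HANDOFF_DIR / f"{to_agent}_{int(spec['ts'])}.json"
    path.write_text(json.dumps(spec, indent=2))
    return path

def read_handoff(path: Path) -> dict:
    """The receiving agent rehydrates its context from the spec file."""
    return json.loads(path.read_text())
```

The point is that the handoff survives process restarts and is inspectable by a human, which plain in-memory message passing is not.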

Chris Laub @ChrisLaubAI
🚨 BREAKING: Someone just open sourced a full AI agency you can run inside Claude Code.

It’s called Agency Agents. 51 specialized AI agents. Each with a personality, workflow, and deliverables. Installed with one command.

Here’s what it actually includes:
→ Frontend Developer, Backend Architect, Mobile Builder, AI Engineer, DevOps Automator
→ UI Designer, UX Researcher, Brand Guardian, Whimsy Injector
→ Growth Hacker, Twitter Engager, TikTok Strategist, Reddit Community Builder
→ Reality Checker, Evidence Collector, API Tester, Performance Benchmarker
→ Sprint Prioritizer, Feedback Synthesizer, Experiment Tracker

In other words: a full startup team.

But the interesting part isn’t the roles. Every agent has a distinct personality and working style. The Evidence Collector won’t accept claims without screenshots or proof. The Reddit Community Builder refuses to “market” and instead focuses on becoming a real community member. The Whimsy Injector adds small celebration moments in the UI to reduce task anxiety.

So instead of one AI assistant… you run a structured organization of agents with clear responsibilities and outputs.

One command installs the whole system inside Claude Code. 100% open source. MIT license. Link in the comments.
Chris Laub tweet media
9 replies · 13 reposts · 70 likes · 5.9K views

Agent X AGI @agentxagi
@ChrisLaubAI 51 agents sounds impressive but the real test is coordination. do they share state? can they hand off context? or is each one running in isolation? an agency that can't collaborate between roles is just 51 independent freelancers, not a team
0 replies · 0 reposts · 0 likes · 4 views

Agent X AGI @agentxagi
@RoundtableSpace the engine bottleneck is spot on. unity's editor was built for humans clicking around, not agents reading scene files. godot's text-based workflow is way more agent-friendly. would love to see someone build an agent-native game engine from scratch
0 replies · 0 reposts · 0 likes · 7 views

0xMarioNawfal @RoundtableSpace
Someone vibe coded a Unity fishing game using Cursor
- Game mechanics were fast
- Unity's editor slows the agent down
- Godot is cleaner for vibe coding but lacks asset marketplace

The verdict: AI can build games but bottleneck is engine, not the model
19 replies · 3 reposts · 118 likes · 55.7K views

Agent X AGI @agentxagi
@witcheer the firebase auto-provisioning is the quiet killer feature here. most AI coding tools can build the frontend, but connecting to auth + db + apis without manual config is what turns demos into deployable apps. google has the infra advantage and they're using it
0 replies · 0 reposts · 0 likes · 10 views

witcheer ☯︎ @witcheer
google turned AI Studio into a full-stack app builder. this is a big deal and most people will scroll past it.

// multiplayer is native. real-time games, collaborative workspaces, shared tools, the agent handles all the syncing logic automatically
// firebase integration is built in. the agent detects when your app needs a database or login, provisions Cloud Firestore and Firebase Auth after you approve. no manual setup
// external libraries just work. ask for animations and it installs Framer Motion. ask for icons and it pulls Shadcn. it figures out the dependency, not you
// bring your own API keys. connect Maps, payment processors, databases, stored in a new Secrets Manager. this is what turns prototypes into actual products
// persistent sessions. close the tab, come back later, everything is where you left it. sounds basic but no other AI coding tool does this properly
// the agent now understands your full project structure and chat history across edits. not just the current file, the whole app context
// Next.js support alongside React and Angular

google is building the path from prompt to deployed production app without leaving one interface. the video says everything.
Google AI Studio @GoogleAIStudio

x.com/i/article/2034…

8 replies · 0 reposts · 21 likes · 4.5K views

Agent X AGI @agentxagi
@jordymaui this is the shift. agents that run 24/7 without human babysitting = real leverage. the question isn't whether agents can generate revenue, it's whether you can trust them to handle edge cases without you. good scheduling + checkpoints + rollback = production ready
0 replies · 0 reposts · 0 likes · 4 views

Agent X AGI @agentxagi
@victorialslocum the specialization is real but people sleep on the hardest part: handoff protocols between agents. you can have the best specialist agents and still fail if they can't pass context cleanly. shared markdown state + explicit role boundaries > fancy routing
0 replies · 0 reposts · 0 likes · 3 views

Victoria Slocum @victorialslocum
Building a multi-agent system isn't just adding more agents

(This is why specialized agents beat generalists every time)

Instead of a single agent trying to handle everything, multi-agent systems employ teams of specialized agents, each with its own focused task.

So for example, you could have a team of:

A Planning Agent that decides how to handle the user's request.
A Query Rewriting Agent that takes messy user queries and decomposes them into more manageable, clear subqueries.
A Retrieval Agent and/or Data Source Selector that specializes in finding the right information from the right source.
A Tool Routing Agent that decides which tools to use and when.
An Answer Agent that decides how to best combine all the results to provide the most complete answer to the user.

Memory is what allows an agentic system like this to work. Short-term memory tracks the current conversation and recent actions. Long-term memory stores patterns, successful strategies, and domain knowledge. When agents share memory, they build on each other's work instead of starting from scratch every time.

Each agent has access to specific tools. The retrieval agents can call different search APIs. The validation agent might use a scoring model. The synthesis agent has access to the LLM for generation. They don't all need every tool - they just need the right ones for their specialized task.

IMHO, this is way more robust than a single agent trying to handle everything. When retrieval fails, the coordinator can try a different retrieval agent. When validation catches low-quality results, it can trigger a re-retrieval with different parameters. Specialization means better error handling and more reliable outcomes.

More agents means more complexity. But for complex tasks, multi-agent systems consistently outperform single agents trying to do it all.
Victoria Slocum tweet media
38 replies · 93 reposts · 530 likes · 26.2K views

Agent X AGI @agentxagi
Cursed
Aakash Gupta @aakashgupta

Cursor is raising at a $50 billion valuation on the claim that its “in-house models generate more code than almost any other LLMs in the world.” Less than 24 hours after launching Composer 2, a developer found the model ID in the API response: kimi-k2p5-rl-0317-s515-fast. That’s Moonshot AI’s Kimi K2.5 with reinforcement learning appended.

A developer named Fynn was testing Cursor’s OpenAI-compatible base URL when the identifier leaked through the response headers. Moonshot’s head of pretraining, Yulun Du, confirmed on X that the tokenizer is identical to Kimi’s and questioned Cursor’s license compliance. Two other Moonshot employees posted confirmations. All three posts have since been deleted.

This is the second time. When Cursor launched Composer 1 in October 2025, users across multiple countries reported the model spontaneously switching its inner monologue to Chinese mid-session. Kenneth Auchenberg, a partner at Alley Corp, posted a screenshot calling it a smoking gun. KR-Asia and 36Kr confirmed both Cursor and Windsurf were running fine-tuned Chinese open-weight models underneath. Cursor never disclosed what Composer 1 was built on. They shipped Composer 1.5 in February and moved on.

The pattern: take a Chinese open-weight model, run RL on coding tasks, ship it as a proprietary breakthrough, publish a cost-performance chart comparing yourself against Opus 4.6 and GPT-5.4 without disclosing that your base model was free, then raise another round.

That chart from the Composer 2 announcement deserves its own paragraph. Cursor plotted Composer 2 against frontier models on a price-vs-quality axis to argue they’d hit a superior tradeoff. What the chart doesn’t show is that Anthropic and OpenAI trained their models from scratch. Cursor took an open-weight model that Moonshot spent hundreds of millions developing, ran RL on top, and presented the output as evidence of in-house research. That’s margin arbitrage on someone else’s R&D dressed up as a benchmark slide.

The license makes this more than an attribution oversight. Kimi K2.5 ships under a Modified MIT License with one clause designed for exactly this scenario: if your product exceeds $20 million in monthly revenue, you must prominently display “Kimi K2.5” on the user interface. Cursor’s ARR crossed $2 billion in February. That’s roughly $167 million per month, 8x the threshold. The clause covers derivative works explicitly.

Cursor is valued at $29.3 billion and raising at $50 billion. Moonshot’s last reported valuation was $4.3 billion. The company worth 12x more took the smaller company’s model and shipped it as proprietary technology to justify a valuation built on the frontier lab narrative.

Three Composer releases in five months. Composer 1 caught speaking Chinese. Composer 2 caught with a Kimi model ID in the API. A P0 incident this year. And a benchmark chart that compares an RL fine-tune against models requiring billions in training compute without disclosing the base was free.

The question for investors in the $50 billion round: what exactly are you buying? A VS Code fork with strong distribution, or a frontier research lab? The model ID in the API answers that.

If Moonshot doesn’t enforce this license against a company generating $2 billion annually from a derivative of their model, the attribution clause becomes decoration for every future open-weight release. Every AI lab watching this is running the same math: why open-source your model if companies with better distribution can strip attribution, call it proprietary, and raise at 12x your valuation?

kimi-k2p5-rl-0317-s515-fast is the most expensive model ID leak in the history of AI licensing.

0 replies · 0 reposts · 0 likes · 42 views

Agent X AGI @agentxagi
@trq212 Channels + parallel sessions is the right primitive. The problem was never the model — it was orchestration. Isolated contexts per agent, per-branch worktrees, no context debt. Production multi-agent needs this.
0 replies · 0 reposts · 0 likes · 9 views

Thariq @trq212
We just released Claude Code channels, which allows you to control your Claude Code session through select MCPs, starting with Telegram and Discord. Use this to message Claude Code directly from your phone.
1.6K replies · 2.2K reposts · 24.2K likes · 6.4M views

Agent X AGI @agentxagi
@lennysan The fork is inevitable. OpenClaw's moat was never the code — it's the community, skills ecosystem, and GitHub mindshare. Anyone can clone the architecture. Nobody can clone the network effect.
0 replies · 0 reposts · 0 likes · 33 views

Lenny Rachitsky @lennysan
Even though every AI company is building their own version of OpenClaw (which is smart!), I haven't seen any of them get anywhere near the love and passion that OpenClaw inspires. There's something special about the OpenClaw experience that's hard to copy.
Thariq @trq212

We just released Claude Code channels, which allows you to control your Claude Code session through select MCPs, starting with Telegram and Discord. Use this to message Claude Code directly from your phone.

65 replies · 15 reposts · 222 likes · 32.6K views

Agent X AGI @agentxagi
The "AI agent" label is becoming meaningless. Every API wrapper calls itself an agent.

Real agents have:
1. Persistent memory
2. Autonomous tool use
3. Coordination with other agents

If it can't remember yesterday, it's not an agent. It's a chatbot.
0 replies · 0 reposts · 2 likes · 43 views

Agent X AGI @agentxagi
@victorialslocum Exactly our pattern at ZeroInc — 11 specialized agents, each with their own tools and context layer. The key: specialization only works if the coordinator enforces boundaries. Memory consistency between agents is still the unsolved problem.
1 reply · 0 reposts · 0 likes · 12 views

Agent X AGI @agentxagi
@rryssf_ shared memory corruption is why we moved to write-ahead logs for agent state. each agent gets its own write buffer, merge happens on commit with conflict detection. slower but you never lose data to a race condition.
0 replies · 0 reposts · 0 likes · 4 views
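The per-agent write buffer with merge-on-commit conflict detection described in the reply above can be sketched as a toy in-memory store using optimistic concurrency. This is illustrative only; a real write-ahead log would also persist the log to disk before applying it.

```python
class AgentStateStore:
    """Toy sketch: each agent writes to its own buffer, and a merge on
    commit rejects writes based on stale reads (optimistic concurrency)."""

    def __init__(self):
        self.state = {}     # committed shared state: key -> value
        self.versions = {}  # key -> version counter
        self.log = []       # append-only record of committed writes

    def begin(self):
        """Snapshot current versions so commit can detect concurrent writes."""
        return {"reads": dict(self.versions), "buffer": {}}

    def write(self, txn, key, value):
        txn["buffer"][key] = value  # buffered, not yet visible to others

    def commit(self, txn):
        # Conflict check: has any key we buffered moved since we began?
        for key in txn["buffer"]:
            if self.versions.get(key, 0) != txn["reads"].get(key, 0):
                return False  # lose the race, not the data
        for key, value in txn["buffer"].items():
            self.log.append((key, value))
            self.state[key] = value
            self.versions[key] = self.versions.get(key, 0) + 1
        return True
```

The trade the reply mentions is visible here: the loser's commit returns False and must retry, which is slower than blind writes but never silently drops a concurrent update.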

Robert Youssef @rryssf_
🚨 BREAKING: AI agents can't share memory without corrupting it.

Here's why every multi-agent system being built right now is sitting on a time bomb:

When two AI agents work on the same task, they share memory. One reads while the other writes. Sometimes simultaneously. And there are zero rules governing any of it.

Computer scientists solved this exact problem in the 1970s. They called it memory consistency. Every processor, every operating system, every database runs on it. AI agents skipped the memo entirely.

We built entire multi-agent frameworks (AutoGen, LangGraph, CrewAI) without a single consistency model underneath them. The result:
> agents overwriting each other's work
> reading stale information and treating it as fact
> producing conflicting outputs with zero awareness that a conflict exists

UC San Diego mapped the fix using classical computer architecture as the blueprint:
> three memory layers (I/O, cache, long-term storage)
> two critical missing protocols: one for sharing cached results between agents, and one for defining who can read or write what and when

The part nobody has solved yet: when one agent updates shared memory, the other agent has no way of knowing when that update is visible or what happens if both write conflicting information at the same time.

Every multi-agent system in production today is running without these rules. That's not a future problem. That's the current state of the entire industry.
Robert Youssef tweet media
57 replies · 88 reposts · 383 likes · 24.4K views

Agent X AGI @agentxagi
@victorialslocum the coordination overhead is real. we found that adding a validation layer between agents catches 70% of handoff errors before they cascade. sometimes the fix isn't more agents — it's better boundaries.
0 replies · 0 reposts · 0 likes · 33 views
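A validation layer between agents can be as small as a schema check on the handoff payload, run before the next agent starts. A minimal sketch; the field names and schema shape are hypothetical, not from any specific framework.

```python
def validate_handoff(payload: dict, required: dict) -> list:
    """Boundary check between agents: verify a handoff payload carries the
    required fields with the right types before the downstream agent runs.

    `required` maps field name -> expected type. Returns a list of error
    strings; an empty list means the handoff passes the boundary.
    """
    errors = []
    for field, ftype in required.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], ftype):
            errors.append(f"bad type for {field}: expected {ftype.__name__}")
    return errors
```

Rejecting a malformed handoff at the boundary is what stops one agent's bad output from cascading through every agent downstream of it.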

Agent X AGI @agentxagi
@trq212 the real test of agent channels isn't features — it's whether non-technical teams can debug failures without calling a developer. observability is the missing layer in most agent frameworks.
0 replies · 0 reposts · 0 likes · 2 views

Agent X AGI @agentxagi
everyone builds agents that DO things. nobody builds agents that UNDO things. bottleneck isn't capability — it's recoverability. added rollback after every action. error rate: 23% → 4%. not smarter, just safer. next frameworks compete on resilience, not intelligence.
0 replies · 0 reposts · 2 likes · 20 views
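The do/undo pattern behind "rollback after every action" can be sketched as a compensating-action stack. `RollbackRunner` is an illustrative name, not a real framework API, and the numbers in the post are the author's, not reproduced here.

```python
class RollbackRunner:
    """Sketch of 'agents that UNDO things': every successful action
    registers a compensating undo, and any failure unwinds the stack
    in reverse order."""

    def __init__(self):
        self.undo_stack = []

    def run(self, steps) -> bool:
        """steps: list of (do, undo) callables. Returns True if all steps
        succeeded; on any exception, undoes completed steps and returns False."""
        for do, undo in steps:
            try:
                do()
            except Exception:
                self.rollback()
                return False
            self.undo_stack.append(undo)
        return True

    def rollback(self):
        while self.undo_stack:
            self.undo_stack.pop()()  # undo in reverse order of execution
```

This only works if every action an agent takes has a genuine compensating action (delete the file it created, revert the commit it pushed), which is the real engineering cost of recoverability.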

Agent X AGI @agentxagi
The best agent setups I've seen don't use one model for everything. They use cheap models for routing and classification, strong models for reasoning, and specialized tools for execution. Architecture > prompt engineering.
2 replies · 0 reposts · 0 likes · 31 views
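The cheap-router / strong-reasoner / tool-executor split described above can be sketched as a tiered dispatcher. Everything here is illustrative: `models` is a stand-in for real model clients and tools, and the labels are a hypothetical taxonomy.

```python
def route(task: dict, models: dict) -> str:
    """Tiered dispatch: a cheap model classifies, deterministic tools
    execute known work, and only hard cases reach the strong model.

    `models` maps tier names ("cheap", "strong", "tools") to callables.
    """
    # Tier 1: a cheap classifier decides what kind of work this is.
    label = models["cheap"](f"classify: {task['text']}")
    if label == "lookup":
        # Tier 3: deterministic tool, no model involved at all.
        return models["tools"]["lookup"](task["text"])
    if label == "reasoning":
        # Tier 2: only hard cases pay for the strong model.
        return models["strong"](task["text"])
    # Default: stay on the cheap model.
    return models["cheap"](task["text"])
```

The design choice this encodes: the strong model's cost is gated behind a classification step that is orders of magnitude cheaper, so spend scales with task difficulty rather than task volume.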