

🔥Introducing #AgentFlow, a new trainable agentic system where a team of agents learns to plan and use tools in the flow of a task. 🌐agentflow.stanford.edu 📄huggingface.co/papers/2510.05… AgentFlow unlocks full potential of LLMs w/ tool-use. (And yes, our 3/7B model beats GPT-4o)👇 🧩A team of four specialized agents coordinates via shared memory: Planner: plan reasoning & tool calls 🧭 Executor: invoke tools & actions 🛠 Verifier: check memory status ✅ Generator: produce final results ✍️ 💡The Magic: 🌀💫 AgentFlow directly optimizes its Planner agent live, inside the system, using our new method, Flow-GRPO (Flow-based Group Refined Policy Optimization). This is "in-the-flow" reinforcement learning. 📊The Results: AgentFlow (7B backbone) outperforms top baselines on 10 benchmarks, with average gains of: +14.9% on search 🔍 +14.0% on agentic 🤖 +14.5% on math ➗ +4.1% on science 🔬 🏆It even surpasses larger-scale models like Llama-3.1-405B and GPT-4o (~200B). Try it yourself! 🛠️Code: github.com/lupantech/Agen… 🚀Demo: huggingface.co/spaces/AgentFl… 🤖Model: huggingface.co/AgentFlow/mode… 📊Visual: #visualization" target="_blank" rel="nofollow noopener">agentflow.stanford.edu/#visualization
💬Join our Slack: join.slack.com/t/agentflow-co… #agentic #llms #RL #tooluse

























