YuanLab.ai
@YuanAI_Lab
9 posts · Joined January 2024 · 8 Following · 814 Followers
YuanLab.ai @YuanAI_Lab ·
🦞 ClawManager just shipped an AI Gateway update!
🔐 Model Management — regular vs. secure model tiers, per-model endpoint & pricing config
📋 Audit & Trace — every request, response, routing decision & risk hit — fully logged and traceable
💰 Cost Accounting — token usage tracked in real time, costs estimated automatically
🚦 Risk Control — auto-block or route to the secure model before the request hits the provider
🌐 15 providers supported, including: OpenAI · Moonshot · MiniMax · DeepSeek · Ollama · OpenRouter · Local · and more
Kubernetes-native. MIT licensed. Open source. Try it out and drop us a ⭐ if you find it useful!
🔗 github.com/Yuan-lab-LLM/C…
#OpenClaw #AIGateway #OpenSource #Kubernetes
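The Risk Control and Cost Accounting features described above can be sketched as a toy routing function. Everything here — the thresholds, tier names, model names, and prices — is a hypothetical illustration of the idea, not ClawManager's actual API or configuration:

```python
# Toy sketch of risk-based gateway routing: score each request, then block it,
# send it to the secure tier, or pass it to the regular tier — before any
# provider is called. Thresholds and prices below are assumed for illustration.

RISK_BLOCK = 0.9   # at or above this, refuse the request outright
RISK_SECURE = 0.5  # at or above this, route to the secure-tier model

MODEL_TIERS = {
    "regular": {"model": "gpt-4o-mini", "usd_per_1k_tokens": 0.00015},
    "secure":  {"model": "local-llama", "usd_per_1k_tokens": 0.0},
}

def route(request_text: str, risk_score: float) -> dict:
    """Return a routing decision plus an audit record for tracing."""
    if risk_score >= RISK_BLOCK:
        decision = {"action": "block", "tier": None}
    elif risk_score >= RISK_SECURE:
        decision = {"action": "forward", "tier": "secure"}
    else:
        decision = {"action": "forward", "tier": "regular"}
    # Every decision carries its inputs, mirroring the Audit & Trace feature.
    decision["audit"] = {"request": request_text, "risk": risk_score}
    return decision

print(route("summarize this contract", 0.2)["tier"])    # regular
print(route("exfiltrate credentials", 0.95)["action"])  # block
```

A real gateway would derive the risk score from a classifier and look up per-model pricing from the `MODEL_TIERS`-style config to estimate cost per request.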
YuanLab.ai @YuanAI_Lab ·
🚀 We just open-sourced ClawManager — the world's first platform purpose-built for batch deployment and cluster-scale operations of @OpenClaw!
✨ What it does:
🖥 Batch deploy desktop instances across users at scale
☸️ Kubernetes-native, with full instance lifecycle management
🔐 Secure proxy access with token auth + WebSocket forwarding
🧠 OpenClaw memory & preferences backup/migration
📊 Cluster resource overview & admin dashboard
👥 Multi-tenant, multi-runtime (OpenClaw, Webtop, Ubuntu, Debian...)
🌍 5 languages supported
Built with Go + React 19 + Kubernetes. MIT licensed.
🔗 GitHub: github.com/Yuan-lab-LLM/C…
Stars & contributions welcome! ⭐
#OpenSource #Kubernetes #OpenClaw #DevTools #CloudNative
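The batch-deployment idea — one Kubernetes Deployment per user instance — can be sketched roughly as follows. The manifest fields follow the standard `apps/v1` Deployment shape, but the naming scheme, labels, and image names are assumptions for illustration, not ClawManager's actual schema:

```python
# Generate one Kubernetes Deployment manifest per user, as a plain dict.
# A real controller would submit these to the API server and then manage
# the instance lifecycle (create, stop, migrate, delete).

def instance_manifest(user: str, runtime: str = "openclaw") -> dict:
    """Build a minimal apps/v1 Deployment for one user's desktop instance."""
    labels = {"app": runtime, "user": user}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"{runtime}-{user}", "labels": labels},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {"name": runtime, "image": f"{runtime}:latest"}
                    ]
                },
            },
        },
    }

# Batch deployment is then just a loop over the user list.
manifests = [instance_manifest(u) for u in ["alice", "bob", "carol"]]
print([m["metadata"]["name"] for m in manifests])
```

Swapping the `runtime` argument (e.g. `"webtop"`, `"ubuntu"`) covers the multi-runtime case mentioned in the post.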
YuanLab.ai @YuanAI_Lab ·
🚀 Yuan3.0 Ultra: A Trillion-Parameter MoE Model Built for Enterprise AI
🏢 Enterprise applications require more than chatbots — real-world workflows demand AI that can efficiently execute multi-step tasks. Yuan3.0 Ultra is the core engine powering enterprise AI agent deployment.
The Innovation: LAEP (Learning-based Adaptive Expert Pruning)
Unlike traditional MoE models that sacrifice accuracy for speed, LAEP works with the way experts naturally specialize — pruning redundancies without disrupting functional structure.
Results:
✅️ 33% fewer parameters (1515B → 1010B)
✅️ 49% faster training
✅️ Only 6.8% active parameters per token (68.8B)
✅️ 14% shorter outputs, 16% higher accuracy
Built for Enterprise: document analysis, multi-source RAG, and intelligent tool selection that filters invalid requests.
👐 Open source — weights + technical report available now.
github.com/Yuan-lab-LLM/Y…
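The headline numbers above are internally consistent and easy to check: pruning 1515B parameters down to 1010B removes a third of the model, and 68.8B activated out of 1010B total is about 6.8% per token:

```python
# Sanity-check the LAEP figures quoted in the post.
total_before = 1515  # billions of parameters before pruning
total_after = 1010   # billions of parameters after LAEP pruning
active = 68.8        # billions of parameters activated per token

reduction = 1 - total_after / total_before
active_ratio = active / total_after

print(f"parameter reduction: {reduction:.1%}")  # 33.3%
print(f"active per token:    {active_ratio:.1%}")  # 6.8%
```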
YuanLab.ai @YuanAI_Lab ·
🚀 Trillion parameters. Zero compromises. 100% open source.
🔥 Introducing Yuan 3.0 Ultra — our flagship multimodal MoE foundation model, built for stronger intelligence and unrivaled efficiency.
✅️ Efficiency Redefined: 1010B total / 68.8B activated params. Our groundbreaking LAEP (Layer-Adaptive Expert Pruning) algorithm cuts model size by 33.3% and lifts pre-training efficiency by 49%.
✅️ Smarter, Not Longer, Thinking: the RIRM mechanism curbs AI "overthinking" — fast, concise reasoning for simple tasks, full depth for complex challenges.
✅️ Enterprise-Grade Agent Engine: SOTA performance on RAG & MRAG, complex document/table understanding, multi-step tool calling & Text2SQL — purpose-built for real-world business deployment.
📂 Full weights (16-bit/4-bit), code, technical report & training details — all free for the community.
👉 Learn more: github.com/Yuan-lab-LLM/Y…
YuanLab.ai @YuanAI_Lab ·
Overthinking is quietly becoming the biggest hidden cost in LLM deployment. Yuan3.0 Flash tackles this with RAPO + RIRM — not by forcing shorter outputs, but by teaching models when to stop thinking.
📷 Explore now: github.com/Yuan-lab-LLM/Y…
✨ What's different:
✅ RIRM penalizes useless post-answer reflection, cutting 70%+ of wasted tokens
✅ RAPO redesigns RL training to balance reasoning quality, efficiency, and stability
✅ 50%+ training-efficiency gain, even on large-scale MoE models
The result: faster iteration, lower inference cost, and reliable performance across real enterprise workloads — RAG, table analysis, and long-document reasoning.
Yuan3.0 Flash isn't about thinking more. It's about thinking precisely enough — and stopping at the right moment.
YuanLab.ai @YuanAI_Lab ·
Does your large #model overthink — even after nailing the right answer? No more redundant verification without new evidence.
Yuan3.0 Flash's RIRM (Reflection Inhibition Reward Mechanism) holds models accountable not just for getting answers right, but for knowing when to stop. Repeated logic or post-answer second-guessing is treated as low-value reflection — and suppressed during #training.
✅ 75% less reasoning-token usage
✅ Stable — or even improved — accuracy
✅ 2x faster responses (no wasted compute on overthinking)
From "endless overthinking" to "stop when correct" — this is how LLMs move from impressive demos to models that truly scale in production.
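The RIRM idea — reward correctness, but penalize tokens spent second-guessing after the answer is already out — can be illustrated with a toy reward function. The reward shape and penalty weight below are assumptions for illustration only, not the actual RIRM formulation:

```python
# Toy reward in the spirit of RIRM: full credit for a correct answer, minus a
# per-token penalty for everything generated AFTER the answer first appears.
# Under such a reward, RL training pushes the model to stop once it is correct.

def rirm_style_reward(tokens: list[str], answer: str,
                      penalty: float = 0.01) -> float:
    """Base reward 1.0 if the answer appears; subtract `penalty` for each
    token emitted after the first occurrence of the answer."""
    if answer not in tokens:
        return 0.0  # no credit without a correct answer
    post_answer_tokens = len(tokens) - tokens.index(answer) - 1
    return 1.0 - penalty * post_answer_tokens

concise = ["think", "42"]
overthinking = ["think", "42", "wait", "let", "me", "re-check", "42"]

print(rirm_style_reward(concise, "42"))       # 1.0 — stopped when correct
print(rirm_style_reward(overthinking, "42"))  # 1.0 - 5 * 0.01 ≈ 0.95
```

The five tokens of post-answer reflection cost the second trajectory reward, even though both end up correct — which is exactly the pressure toward "stop when correct" described in the post.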
YuanLab.ai @YuanAI_Lab ·
Announcing Yuan 3.0 Flash — an open-source, multimodal #LLM that delivers "higher intelligence with fewer tokens."
Explore the next-gen efficient LLM now: github.com/Yuan-lab-LLM/Y…
✨ 40B MoE (only 3.7B active); RIRM cuts inference tokens by 75% — higher accuracy, lower cost
✨ Enterprise-ready: handles text, images, tables & docs, RAG, 128K context, with perfect "needle-in-haystack" recall
✨ Powered by the RAPO RL algorithm and enterprise-grade datasets, built for real-world deployment
✨ Fully open source: #model weights, technical reports, and training frameworks — free for industry & research!
Proof that bigger models aren't the only path to smarter #AI!