OpenBMB

@OpenBMB
OpenBMB (Open Lab for Big Model Base) aims to build foundation models and systems towards AGI. Connect with us: https://t.co/N9pevTnoOa

(1/2) 🦞 Using @openclaw but worried about sending sensitive data to the cloud? 🤔

Meet #EdgeClaw, a dedicated Local Routing Layer for #OpenClaw that handles data sensitivity and task complexity on the edge. 💻 It's a drop-in enhancement that puts your local hardware to work as a Privacy Guard & Cost Judge.

3-Tier Security (Regex + Local LLM Engine):
🟢 S1 (Safe): transparent passthrough to the cloud.
🟡 S2 (Sensitive): on-device PII redaction before forwarding.
🔴 S3 (Private): 100% local inference; the cloud only sees a 🔒 placeholder.

🧠 #LLM-as-Judge Routing: classifies requests locally and routes each one to the right model tier (e.g., #MiniCPM ➡️ #GPT-4o/#Claude), optimizing resource allocation and cutting unnecessary cloud calls. 📉 A minimal routing sketch is below.

Learn more 🔗 github.com/openbmb/edgecl… #EdgeAI #LLM #Privacy #EdgeClaw #OpenClaw
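
A minimal sketch of the tiered idea described above, using only stdlib regexes. The function names, tier labels, and patterns here are hypothetical illustrations, not the actual EdgeClaw API, and a real deployment would add the local LLM judge on top of the regex pre-filter:

```python
# Illustrative 3-tier routing sketch; names and patterns are hypothetical,
# not EdgeClaw's real implementation.
import re

# S2 stage: crude regex redaction (a real PII pass would be far broader).
PII_PATTERNS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
]
PRIVATE_HINTS = re.compile(r"\b(password|api[_ ]key|secret)\b", re.I)

def classify(prompt: str) -> str:
    """Regex pre-filter; EdgeClaw also consults a local LLM judge."""
    if PRIVATE_HINTS.search(prompt):
        return "S3"  # Private: never leaves the device
    if any(p.search(prompt) for p, _ in PII_PATTERNS):
        return "S2"  # Sensitive: redact, then forward
    return "S1"      # Safe: transparent passthrough

def route(prompt: str) -> tuple[str, str]:
    """Return (model tier, possibly-redacted prompt)."""
    tier = classify(prompt)
    if tier == "S3":
        return ("local-minicpm", prompt)   # 100% local inference
    if tier == "S2":
        for pattern, placeholder in PII_PATTERNS:
            prompt = pattern.sub(placeholder, prompt)
    return ("cloud-gpt-4o", prompt)        # cloud sees placeholders only

print(route("Email bob@example.com about the launch"))
# ('cloud-gpt-4o', 'Email [EMAIL] about the launch')
```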

✨ vLLM v0.17.0 is out with support for #MiniCPM-o 4.5! 🚀 Now you can serve the latest 9B #omnimodal model with vLLM's high-throughput serving engine. For developers, this means scaling real-time, full-duplex conversations (vision, speech, and text) is now production-ready. A hedged serving sketch follows. @vllm_project 👏 Check the release: github.com/vllm-project/v… #LLM #vLLM #MiniCPM #OpenSource
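
A minimal client-side sketch against vLLM's OpenAI-compatible server. The Hugging Face repo id below is an assumption for illustration (check the release notes for the actual one); the `vllm serve` command and the OpenAI chat format are standard vLLM usage:

```python
# Query a running vLLM OpenAI-compatible endpoint.
# Server (separate shell), repo id is a hypothetical placeholder:
#   vllm serve openbmb/MiniCPM-o-4_5 --trust-remote-code
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="openbmb/MiniCPM-o-4_5",  # hypothetical repo id
    messages=[
        {"role": "user", "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/cat.jpg"}},
        ]},
    ],
)
print(response.choices[0].message.content)
```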

From lab to open source: a new milestone for AI-driven education. 🎓

🤗 We've been closely following the MAIC project at Tsinghua University, and we're thrilled to see it now open-sourced as #OpenMAIC. ✨ This isn't just another chatbot; it takes Multi-Agent orchestration to the next level by building a fully interactive classroom where AI instructors and peers collaborate in real time.

What makes it technically impressive:
🛠️ Complex Orchestration: leveraging #LangGraph to manage spontaneous interactions, like #AI students "raising hands" during a live lecture.
🧠 Structured Planning: a dedicated "Plan Agent" that transforms raw PDFs into coherent, logically sequenced pedagogical flows.
💻 Beyond Text: a masterclass in GenUI implementation, featuring synchronized TTS, laser pointers, and real-time whiteboard demonstrations.

🥳 If you're building complex, multi-modal #Agent workflows, this repo is a treasure trove of engineering insights. A sketch of the "raising hands" pattern is below.

🖥️ Explore the project: github.com/THU-MAIC/OpenM…
📰 Read the research: jcst.ict.ac.cn/article/doi/10…
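
A minimal LangGraph sketch of the "AI student raises a hand mid-lecture" control flow, assuming the standard StateGraph API. The node names, state fields, and trigger policy are hypothetical illustrations, not OpenMAIC's actual graph:

```python
# Illustrative "raise hand" interruption pattern with LangGraph.
from typing import TypedDict
from langgraph.graph import StateGraph, END

class ClassState(TypedDict):
    slide: int
    hand_raised: bool
    transcript: list[str]

def teacher_lecture(state: ClassState) -> ClassState:
    state["transcript"].append(f"Teacher: explaining slide {state['slide']}")
    return state

def ai_student(state: ClassState) -> ClassState:
    # A stand-in policy; in MAIC an LLM agent decides whether to interrupt.
    state["hand_raised"] = state["slide"] == 2
    return state

def answer_question(state: ClassState) -> ClassState:
    state["transcript"].append("Teacher: answering the student's question")
    state["hand_raised"] = False
    return state

def next_step(state: ClassState) -> str:
    # Conditional edge: detour to Q&A only if a hand went up.
    return "answer_question" if state["hand_raised"] else END

graph = StateGraph(ClassState)
graph.add_node("teacher_lecture", teacher_lecture)
graph.add_node("ai_student", ai_student)
graph.add_node("answer_question", answer_question)
graph.set_entry_point("teacher_lecture")
graph.add_edge("teacher_lecture", "ai_student")
graph.add_conditional_edges("ai_student", next_step)
graph.add_edge("answer_question", END)

app = graph.compile()
print(app.invoke({"slide": 2, "hand_raised": False, "transcript": []}))
```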


🚀 vLLM v0.17.0 is here! 699 commits from 272 contributors (48 new!) This is a big one.

Highlights:
⚡ FlashAttention 4 integration
🧠 Qwen3.5 model family with GDN (Gated Delta Networks)
🏗️ Model Runner V2 maturation: Pipeline Parallel, Decode Context Parallel, Eagle3 + CUDA graphs
🎛️ New --performance-mode flag: balanced / interactivity / throughput (see the sketch after this list)
💾 Weight Offloading V2 with prefetching
🔀 Elastic Expert Parallelism Milestone 2
🔧 Quantized LoRA adapters (QLoRA) now loadable directly
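
A minimal launch sketch for the new flag, assuming vLLM is installed on PATH. Only the flag name and its three values come from the release notes above; the model id is a placeholder:

```python
# Start a vLLM server tuned for throughput via the new flag.
import subprocess

subprocess.run([
    "vllm", "serve", "Qwen/Qwen2.5-7B-Instruct",   # placeholder model id
    "--performance-mode", "throughput",            # or: balanced / interactivity
])
```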

mlx-vlm v0.4.0 is here 🚀

New models:
• Moondream3 by @vikhyatk
• Phi-4-reasoning-vision by @MSFTResearch
• Phi4-multimodal-instruct by @MSFTResearch
• MiniCPM-o-2.5 (except TTS) by @OpenBMB

What's new:
→ Full-weight finetuning + ORPO h/t @ActuallyIsaak
→ Tool calling in server
→ Thinking budget support
→ KV cache quantization for server
→ Fused SDPA attention optimization
→ Streaming & OpenAI-compatible endpoint improvements

Fixes:
• Gemma3n
• Qwen3-VL
• Qwen3.5-MoE
• Qwen3-Omni h/t @ronaldseoh
• Batch inference, and more

Big shoutout to 7 new contributors this release! 🙌

Get started today (a quick inference sketch follows):
> uv pip install -U mlx-vlm

Leave us a star ⭐️ github.com/Blaizzy/mlx-vl…
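
A minimal sketch of the library's typical load/generate flow. The model id is a placeholder and exact signatures may differ across mlx-vlm versions, so treat this as a starting point rather than the canonical example:

```python
# Basic mlx-vlm image description on Apple Silicon.
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "mlx-community/Qwen2-VL-2B-Instruct-4bit"  # placeholder model id
model, processor = load(model_path)
config = load_config(model_path)

images = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "Describe this image."

# Wrap the raw prompt in the model's chat template before generating.
formatted = apply_chat_template(processor, config, prompt, num_images=len(images))
output = generate(model, processor, formatted, images, verbose=False)
print(output)
```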

@Alibaba_Qwen Impressive intelligence density