OpenMMLab

701 posts

@OpenMMLab

From MMDetection to AI Exploration. Empowering AI research and development with OpenMMLab. Discord: https://t.co/BWaz5KtF5e

Joined June 2020

131 Following · 6.2K Followers

OpenMMLab retweeted

Intern Large Models@intern_lm·1d

🔥Introducing ARM-Thinker, the first Agentic multimodal Reward Model that autonomously invokes external tools to ground judgments in verifiable evidence. Accepted to CVPR 2026! 🥳Integrates three families of multimodal tools: 1⃣Image Crop & Zoom-in for fine-grained visual inspection. 2⃣Document Retrieval for multi-page evidence gathering. 3⃣Instruction-Following Validators for constraint verification. 🥳With a Think-Act-Verify loop, ARM-Thinker can call image crop & zoom-in, document retrieval, and instruction-following validators for evidence-based evaluation. 🥳Built on Qwen2.5-VL-7B with SFT + two-stage GRPO, ARM-Thinker improves multimodal reward modeling, tool-use reasoning, and multimodal math/logical reasoning. 😉+16.2% on reward modeling benchmarks (outperforming GPT-4o). 😉+9.6% on tool-use / think-with-images tasks (matching Mini-o3). 😉+4.2% on multimodal math & logical reasoning. 🥳Also introduce ARMBench-VL, the first multimodal reward benchmark that requires tool use. 📄 Paper: arxiv.org/abs/2512.05111 💻 Code: github.com/InternLM/ARM-T… 🤗 Dataset: @huggingface huggingface.co/datasets/inter… 🧪 Evaluation: github.com/open-compass/V…

OpenMMLab retweeted

Intern Large Models@intern_lm·13 Mar

🚀Meet InternVL-U: a lightweight 4B unified multimodal model that brings reasoning, generation, and editing into a unified framework. 🔥Built upon unified contextual modeling, modality-specific modular design, and decoupled visual representations, InternVL-U achieves a strong performance-efficiency trade-off, consistently outperforming unified baselines with over 3× larger model scales on challenging tasks such as text rendering, scientific reasoning, and spatially grounded generation and editing. 😉Open-source and designed for efficient, practical multimodal intelligence. 🤗GitHub: github.com/OpenGVLab/Inte… 🤗Hugging Face: @huggingface huggingface.co/InternVL-U/Int… 🤗GenEditEvalKit: github.com/open-compass/G… 🤗TextEdit: github.com/open-compass/T… 🤗Tech report: arxiv.org/pdf/2603.09877

OpenMMLab retweeted

Intern Large Models@intern_lm·4 Feb

🚀Introducing Intern-S1-Pro, an advanced 1T MoE open-source multimodal scientific reasoning model. 1⃣SOTA scientific reasoning, competitive with leading closed-source models across AI4Science tasks. 2⃣Top-tier performance on advanced reasoning benchmarks, strong general multimodal performance on various benchmarks. 3⃣1T-A22B MoE training efficiency with STE routing (dense gradient for router training) and grouped routing for stable convergence and balanced expert parallelism. 4⃣Fourier Position Encoding (FoPE) + upgraded time-series modeling for better physical signal representation; supports long, heterogeneous time-series (10^0–10^6 points). 😍Intern-S1-Pro is now supported by vLLM @vllm_project and SGLang @sgl_project @lmsysorg, with more ecosystem integrations on the way. ☺️Model: @huggingface huggingface.co/internlm/Inter… ☺️GitHub: github.com/InternLM/Inter… ☺️Try it now at: chat.intern-ai.org.cn

OpenMMLab retweeted

Intern Large Models@intern_lm·25 Nov

🚀 Introducing Spatial-SSRL, the first study to propose a Self-Supervised Reinforcement Learning paradigm for spatial understanding. 💡 Spatial-SSRL is a lightweight, tool-free framework that aims to enhance spatial intelligence and is naturally compatible with the RLVR training paradigm. Only raw 2D and RGB-D images are required, and we avoid any use of human annotation, external proprietary models, or expert models throughout the entire pipeline, making Spatial-SSRL highly cost-effective and scalable. 🛰️ Spatial-SSRL currently comprises five pretext tasks: shuffled patch reordering, flipped patch recognition, cropped patch inpainting, regional depth ordering, and 3D relative position prediction. Thanks to its lightweight design, Spatial-SSRL can be easily extended to more pretext tasks, and we welcome the whole community to contribute effective pretext tasks! 🤖 Applying Spatial-SSRL significantly improves the spatial understanding of Qwen2.5-VL (3B & 7B) and Qwen3-VL (4B) while retaining their general visual capabilities. 🤗 We have released the Spatial-SSRL repository, the Spatial-SSRL-81k dataset, and the trained models Spatial-SSRL-7B and Spatial-SSRL-Qwen3VL-4B. Total downloads of the models and dataset have surpassed 1,000. 👇 Try Spatial-SSRL-7B now at: huggingface.co/spaces/yuhangz… Paper: arxiv.org/abs/2510.27606 GitHub: github.com/InternLM/Spati… Model (on Qwen2.5-VL): huggingface.co/internlm/Spati… Model (on Qwen3-VL): huggingface.co/internlm/Spati… Dataset: huggingface.co/datasets/inter…

OpenMMLab retweeted

OpenCompass@OpenCompassX·18 Nov

🥰OutSafe-Bench, the first and most comprehensive content safety evaluation test suite designed for the multimodal era. 😍Covers 4 modalities: 18,000+ bilingual (ZH/EN) text prompts, 4,500 images, 450 audio clips, and 450 videos. 👏OutSafe-Bench is now part of the Daily Benchmark. 👇Explore more: hub.opencompass.org.cn/daily-benchmar…

OpenMMLab retweeted

OpenCompass@OpenCompassX·17 Nov

🚀 OpenCompass Daily Benchmark is live! ✅ Daily updates of the latest AI evaluation papers ✅ AI-powered smart summaries ✅ Available in English & Chinese 😍Stay ahead of AI trends, key insights, and cutting-edge research—all in one place! 🔗 hub.opencompass.org.cn/daily-benchmar…

OpenMMLab retweeted

Intern Large Models@intern_lm·29 Oct

🚀Introducing #CapRL, the first study to apply GRPO to the open-ended and subjective image captioning task. 🤯 🤖The trained CapRL-3B model achieves image captioning performance comparable to Qwen2.5-VL-72B. ✨CapRL introduces a novel training framework that redefines caption quality through its utility: a high-quality caption should enable a non-visual language model to accurately answer questions about the corresponding image. 📈CapRL is now open-sourced, with total downloads of the models and datasets surpassing 7,000. The research team is continuing to iterate with stronger base models and improved training recipes. 👇 Try it now at: huggingface.co/spaces/yuhangz… Paper: arxiv.org/abs/2509.22647 GitHub: github.com/InternLM/CapRL Model: huggingface.co/internlm/CapRL… Dataset: huggingface.co/datasets/inter…

OpenMMLab retweeted

Intern Large Models@intern_lm·26 Sep

🚀 Big news for #lmdeploy v0.10.1! 🥳Our #FP8 high-performance inference is no longer limited to the latest #GPUs. It now supports all #NVIDIA architectures from V100 onwards, bringing major speedups to more users. 🤗github.com/InternLM/lmdep…

OpenMMLab retweeted

Intern Large Models@intern_lm·10 Sep

🔥LMDeploy v0.10.0 released! 😊Supercharges OpenAI’s GPT-OSS MXFP4 models. 😊Delivers exceptional performance for GPT-OSS models on V100 and higher GPUs. 😊On H800 & A100, LMDeploy outperforms vLLM across all scenarios—faster, more efficient inference! 🤗github.com/InternLM/lmdep…

OpenMMLab@OpenMMLab·21 Aug

🔥🔥🔥@intern_lm @OpenBMB @Zai_org @StepFun_ai @AI_AlibabaInt 🔥🔥🔥

OpenMMLab@OpenMMLab

🔥China’s Open-source VLMs boom—Intern-S1, MiniCPM-V-4, GLM-4.5V, Step3, OVIS 🧐Join the AI Insight Talk with @huggingface, @OpenCompassX, @ModelScope2022 and @ZhihuFrontier 🚀Tech deep-dives & breakthroughs 🚀Roundtable debates ⏰Aug 21, 5 AM PDT 📺Live: youtube.com/live/kh0WSMoVZ…

OpenMMLab retweeted

OpenCompass@OpenCompassX·7 Aug

🚀 Introducing #CompassVerifier: A unified and robust answer verifier for #LLMs evaluation and #RLVR! ✨LLM progress is bottlenecked by weak evaluation. Looking for an alternative to rule-based verifiers? CompassVerifier can handle multiple domains including math, science, and reasoning, as well as various answer types such as multiple-choice questions, sequences, and multiple subproblems. 🔍 Introducing #VerifierBench: A challenging dataset for evaluating the verification capabilities of different models! 🏆 Want to evaluate verification abilities? We collected over 1 million LLM responses using the #OpenCompass framework and selected the most challenging examples through multiple rounds of screening and manual annotation. 🏠 Main Page: open-compass.github.io/CompassVerifier 📄 Paper: arxiv.org/pdf/2508.03686 💻 Code: github.com/open-compass/C… 🤗 Model & Datasets: @huggingface huggingface.co/collections/op…

OpenMMLab retweeted

Intern Large Models@intern_lm·30 Jul

Our paper won an Outstanding Paper award at ACL 2025. Try our best open-source multimodal reasoning model, Intern-S1, at huggingface.co/internlm/Inter… This 241B MoE model combines strong general-task capabilities with state-of-the-art performance on a wide range of scientific tasks, rivaling leading closed-source commercial models.

OpenMMLab retweeted

Intern Large Models@intern_lm·26 Jul

🚀Introducing Intern-S1, our most advanced open-source multimodal reasoning model yet! 🥳Strong general-task capabilities + SOTA performance on scientific tasks, rivaling leading closed-source commercial models. 🥰Built upon a 235B MoE language model and a 6B vision encoder. 🥰Pretrained on 5T tokens (50%+ scientific data). 🥳Dynamic tokenizer enables native understanding of molecular formulas, protein sequences, and seismic signals. 🤗Model: @huggingface huggingface.co/internlm/Inter… 🤗GitHub: github.com/InternLM/Inter… Try it now at: 🤗chat.intern-ai.org.cn

OpenMMLab retweeted

Intern Large Models@intern_lm·8 Jul

🚀 Introducing #POLAR: Bring Reward Model into a New Pre-training Era! ✨ Say goodbye to reward models with poor generalization! POLAR (Policy Discriminative Learning) is a groundbreaking pre-training paradigm that trains reward models to distinguish policy distributions; it scales effortlessly and eliminates heavy reliance on human preference data! 🏆 Tailored for Reinforcement Fine-tuning (#RFT)! POLAR assigns rewards based on ground truths, seamlessly integrating into the RFT framework and achieving SOTA RL performance across general tasks! 🤗Paper: arxiv.org/pdf/2507.05197 😃Models: huggingface.co/internlm/POLAR… 😉Code: github.com/InternLM/POLAR

OpenMMLab retweeted

Tiezhen WANG@Xianbao_QIAN·13 Jun

We invited 3 top HF daily papers authors to deliver talks. Topics of this session: Reinforcement Learning Speakers: - Qi-Chen Zhao — Absolute Zero Reasoner: self-play RL that reaches SOTA reasoning with zero external data - Shu-Huai Ren — MiMo-VL: Xiaomi’s unified and lightweight vision-language model - Yu-Zhe Gu — OREAL: an outcome-reward RL method for mid-size LMs to pass complex math gates Time: 6.14 10:00 - 12:00 (China time) x.com/i/broadcasts/1…

OpenMMLab retweeted

Intern Large Models@intern_lm·23 May

🥳Trained through #InternBootcamp, #InternThinker now combines pro-level Go skills with transparent reasoning. 😉In each game, it acts as a patient, insightful coach—analyzing the board, comparing moves, and clearly explaining each decision. 🤗Try it now: chat.intern-ai.org.cn/internthinker/…

Intern Large Models@intern_lm

🥳Introducing #InternBootcamp, an easy-to-use and extensible library for training large reasoning models. Unlimited automatic question generation and result verification. Over 1,000 verifiable tasks covering logic, puzzles, algorithms, games, and more. 🤗github.com/InternLM/Inter…

OpenMMLab@OpenMMLab·21 Mar

🥳#FaceShot generates animations for your "imaginary friends", like Teddy Bear, and brings them to life! 😉Project page: faceshot2024.github.io/faceshot/ 😉Paper link: arxiv.org/abs/2503.00740 😉Code: github.com/open-mmlab/Fac…
