ModelScope

598 posts

@ModelScope2022

Driving innovations with open communities.

Hangzhou, China · Joined April 2024
91 Following · 6.6K Followers
Erika S @E_FutureFan
@ModelScope2022 I'm skeptical when small models claim SOTA, but beating Qwen3-VL-235B on OmniDocBench with just 3B? That's efficient scaling. How's it with degraded historical archives?
ModelScope @ModelScope2022
🔥 Meet dots.ocr-1.5: a 3B OCR model from Rednote-hilab with SOTA multilingual document parsing across virtually any writing system.
📊 Elo 1089 on olmOCR-Bench, 1157 on XDocParse — above GLM-OCR and PaddleOCR-VL-1.5
📄 OmniDocBench text edit 0.031, beats Qwen3-VL-235B (0.069) and Gemini 2.5 Pro (0.075)
🎨 SVG code output for charts, diagrams, and chemical formulas
🌐 Web parsing, scene text spotting, and object counting included
⚡ vLLM supported, runs on a single GPU
🤖 Model: modelscope.cn/models/rednote…
🔗 GitHub: github.com/rednote-hilab/…
🎠 Demo: dotsocr.xiaohongshu.com
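For readers unfamiliar with the "text edit" numbers quoted for OmniDocBench: they are normalized edit distances between predicted and reference text, so lower is better. A minimal sketch of that metric, assuming a common max-length normalization (the benchmark's exact variant may differ):

```python
def edit_distance(a: str, b: str) -> int:
    # Classic Levenshtein dynamic program, O(len(a) * len(b)).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def normalized_edit(pred: str, ref: str) -> float:
    # Edit distance scaled by the longer string; 0.0 means a perfect match.
    if not ref:
        return float(bool(pred))
    return edit_distance(pred, ref) / max(len(pred), len(ref))
```

On this scale, a score of 0.031 means roughly 3 character-level errors per 100 characters of reference text.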
kevin @tobeniceman
@ModelScope2022 Not bad. Why didn't it compare with GLM-OCR?
ModelScope @ModelScope2022
Say hi to Qianfan-OCR: a 4B end-to-end document intelligence model achieving SOTA among all end-to-end models on OmniDocBench v1.5 and olmOCR-Bench.
🏆 OmniDocBench v1.5: 93.12, beats DeepSeek-OCR-v2 and Gemini-3 Pro
🏆 KIE average 87.9, above Gemini-3.1-Pro and Qwen3-VL-235B-A22B
🧠 Layout-as-Thought: reasoning mode via token for complex layout recovery
🌍 192 languages supported
⚡ 1.024 PPS on A100 with W8A8 quantization
✍️ Apache 2.0. vLLM ready.
🤖 Model: modelscope.cn/models/baidu-q…
📄 Paper: modelscope.ai/papers/2603.13…
ModelScope @ModelScope2022
dots.mocr from Rednote: a 3B multimodal OCR model building on dots.ocr with stronger benchmarks and broader task coverage. 🚀
📊 Tops HunyuanOCR, GLM-OCR, and PaddleOCR-VL-1.5 across olmOCR-Bench, OmniDocBench v1.5, and XDocParse with an Elo average of 1124.7
🎨 Charts, UI layouts, and scientific figures parsed directly to SVG — dots.mocr-svg variant for dedicated image-to-SVG tasks
🌐 Web parsing, scene text spotting, and document QA all included
⚡ Integrated into vLLM from v0.11.0
📄 Apache 2.0.
🤖 Model: modelscope.cn/models/rednote…
🌍 Model: modelscope.ai/organization/r…
📄 Paper: modelscope.ai/papers/2603.13…
🔧 GitHub: github.com/rednote-hilab/…
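The Elo figures these OCR posts cite come from pairwise preference comparisons between models. A minimal sketch of the standard Elo arithmetic (the K-factor and pairing scheme here are assumptions; benchmark leaderboards may aggregate differently):

```python
def elo_expected(r_a: float, r_b: float) -> float:
    # Probability that model A beats model B under the Elo model.
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    # score_a: 1.0 = A wins the comparison, 0.5 = tie, 0.0 = A loses.
    delta = k * (score_a - elo_expected(r_a, r_b))
    return r_a + delta, r_b - delta
```

For example, a model rated 1124.7 is expected to win slightly more than half of its pairwise comparisons against one rated 1089.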
ModelScope @ModelScope2022
ModelScope Civision now supports FireRed-Image-Edit-1.1 🚀 Free image generation and training, ready to use. 👉 Give it a try: modelscope.cn/aigc
ModelScope @ModelScope2022
Step-3.5-Flash-SFT is open: the complete SFT training corpus, tokenizer snapshots, and pre-compiled StepTronOSS shards, all in one release.
📊 Dataset: modelscope.cn/datasets/stepf…
🧑‍💻 Code: github.com/stepfun-ai/Ste…
- Multi-turn conversation JSON with loss_mask and optional reasoning_content
- Tokenizers for Step-3.5-Flash and Qwen3 included for chat template alignment
- Pre-compiled shards: drop in and train, no preprocessing
- Reference recipes for both Step-3.5-Flash and Qwen3 variants
- Apache-2.0 + CC-BY-NC-2.0
🌉 Weights + training framework + SFT data. The full stack.
ModelScope @ModelScope2022

Step 3.5 Flash is now open source: model weights and full training framework (SteptronOSS), released together. 🚀
196B total parameters, 11B active. SWE-bench Verified 74.4% / Terminal-Bench 2.0 51.0%.
- MoE architecture: 288 routed experts + 1 shared, Top-8 activation per token
- MTP-3: predicts 4 tokens per forward pass, 100–300 tok/s typical, 350 tok/s peak
- 3:1 SWA ratio (1 full-attention + 3 sliding-window layers): 256K context at lower compute cost
- 💻 Runs on Mac Studio M4 Max and NVIDIA DGX Spark
- SteptronOSS: SFT, continued pretraining, RL (WIP)
- Apache 2.0
Two checkpoints released: Step-3.5-Flash-Base and Step-3.5-Flash-Base-Midtrain.
🤖 Base: modelscope.cn/models/stepfun…
🤖 Midtrain: modelscope.cn/models/stepfun…
🔧 Training framework: github.com/stepfun-ai/Ste…
📄 Paper: modelscope.cn/papers/2602.10…
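The Top-8-of-288 routing described in the quoted post can be sketched as softmax gating restricted to the top-k router logits. This is a toy illustration only; a production MoE router (load balancing, shared-expert mixing, any bias terms) is more involved:

```python
import math

def route_top_k(logits, k=8):
    # Select the k highest-scoring experts and renormalize their
    # softmax weights, mirroring Top-8 activation over 288 routed experts.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)                     # stabilize exp()
    exps = {i: math.exp(logits[i] - m) for i in top}
    z = sum(exps.values())
    return {i: exps[i] / z for i in top}                # expert -> weight
```

Each token's output is then a weighted sum of only those 8 experts' outputs (plus the shared expert), which is why 196B total parameters can run with only 11B active per token.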

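The loss_mask field in the SFT records above is typically used to exclude prompt and user tokens from the training loss. A minimal sketch, assuming the common -100 ignore-index convention; the record's field contents here are illustrative, not taken from the actual dataset:

```python
IGNORE_INDEX = -100  # label value excluded from cross-entropy loss

def masked_labels(input_ids, loss_mask):
    # Keep labels only where loss_mask == 1 (assistant tokens);
    # everything else is ignored when computing the loss.
    return [tok if m == 1 else IGNORE_INDEX
            for tok, m in zip(input_ids, loss_mask)]

record = {  # illustrative shape of a multi-turn conversation record
    "input_ids": [101, 7592, 102, 2023, 2003, 102],
    "loss_mask": [0, 0, 0, 1, 1, 1],
}
labels = masked_labels(record["input_ids"], record["loss_mask"])
```

Shipping loss_mask precomputed in the dataset means training code does not need to re-derive turn boundaries from the chat template.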
ModelScope @ModelScope2022
🚀 Skills Central is now live on ModelScope! 🎉 Dive in and explore the amazing Skills built by the open community: 🔗 modelscope.cn/skills
🛠️ Comprehensive coverage: spanning dev tools, frontend, code quality, multimedia, mobile, and cloud tooling.
⚡ Immediate integration: one-line installation into OpenClaw, Cursor, Qoder, and more, or grab the ZIP file with just one click.
🔍 Easy discovery: find the Skills you need in one place, with comprehensive bilingual (English and Chinese) documentation.
🔌 What's next: OpenAPI access, ModelScope SDK integrations, and more features on the way!
Together with our MCP Plaza, we hope the addition of Skills Central will facilitate better interactions between open models and the flourishing tooling ecosystem. Come build with us!
阿杰快跑CL @ajie_run_CL
I trained a LoRA model based on Qwen-Edit-2509 and used LoRA fine-tuning to deeply adapt it to traditional Chinese murals, focusing on applications such as digital restoration of murals and cultural heritage preservation. By bringing cutting-edge AI into traditional art restoration, the model not only improves the accuracy of digital restoration for damaged murals but also advances the intelligent management and intergenerational preservation of cultural heritage, ensuring these precious historical treasures are permanently preserved and widely accessible in digital form. Download link and more examples are in the comments section. @Ali_TongyiLab @Alibaba_Qwen @ModelScope2022 #HappyQwensday #QwenImageLoRA
ModelScope @ModelScope2022
Fun-CineForge is here! 🚀 Inference code and checkpoints just dropped: an end-to-end pipeline and multimodal LLM-based dubbing model built for diverse cinematic scenes.
🎬 Zero-shot dubbing across monologue, narration, dialogue, and multi-speaker scenes
🏗️ End-to-end dataset construction pipeline that generates large-scale annotated dubbing datasets from raw video
📦 CineDub-CN: first large-scale Chinese TV drama dubbing dataset with rich annotations and diverse scene types
🌐 English video support added, with CineDub-EN samples now available
🏆 Outperforms SOTA on audio quality, lip sync, timbre conversion, and instruction following across all scene types
🔓 Pipeline toolkit, model weights, and inference code fully open
🤖 Model: modelscope.cn/models/FunAudi…
🔧 GitHub: github.com/FunAudioLLM/Fu…
🎠 Demo: funcineforge.github.io
ModelScope @ModelScope2022
🎧 Fish Audio S2 Pro is open source: a 4B+400M Dual-AR TTS model with free-form inline prosody and emotion control, trained on 10M+ hours of audio across 80+ languages. 💬
🏗️ Dual-AR architecture: 4B Slow AR for semantics + 400M Fast AR for 9 residual codebooks — quality without inference overhead
🎭 Inline control via free-form tags: [whisper], [laughing], [professional broadcast tone] — 15,000+ unique tags, word-level precision
🌐 80+ languages; Tier 1: Japanese, English, Chinese
⚡ SGLang-native: continuous batching, paged KV cache, RadixAttention prefix caching — all inherited from the LLM serving stack
📊 RTF 0.195 on H200, ~100 ms time-to-first-audio, 3,000+ acoustic tokens/s
🔓 Weights + fine-tuning code + streaming inference engine all released
🌍 Model: modelscope.ai/models/fishaud…
🤖 Model: modelscope.cn/models/fishaud…
🔧 GitHub: github.com/fishaudio/fish…
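The RTF (real-time factor) number above is synthesis time divided by the duration of the audio produced, so values below 1.0 are faster than real time. A small sketch to make the arithmetic concrete:

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    # RTF = wall-clock time spent synthesizing / duration of audio produced.
    # RTF < 1.0 means faster than real time.
    return synthesis_seconds / audio_seconds

def real_time_speedup(rtf: float) -> float:
    # An RTF of 0.195 corresponds to roughly a 5.1x real-time speedup.
    return 1.0 / rtf
```

Together with ~100 ms time-to-first-audio, a low RTF is what makes streaming playback viable: the synthesizer stays ahead of the listener.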
ModelScope @ModelScope2022
14B faster than 1.3B: Helios is here 🚀 a 14B real-time long-video generation model running at 19.5 FPS on a single H100, with native T2V, I2V, and V2V support.
🌟 The breakthroughs:
- No anti-drifting heuristics: no self-forcing, no keyframe sampling — drift is simulated during training instead
- No standard acceleration tricks: no KV-cache, no sparse/linear attention, no quantization
- Compute cost matches 1.3B models via heavy context compression + reduced sampling steps
- Four 14B models fit in 80GB during training, no parallelism framework required
Outperforms prior methods on both short- and long-video benchmarks. Base + distilled models both released.
🤖 Models: modelscope.cn/collections/Be…
🌍 Models: modelscope.ai/profile/BestWi…
📄 Paper: modelscope.cn/papers/2603.04…
🔧 GitHub: github.com/PKU-YuanGroup/…
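"Real time at 19.5 FPS" translates to a fixed wall-clock budget per generated frame. A quick sketch of that arithmetic:

```python
def frame_budget_ms(fps: float) -> float:
    # Wall-clock budget per frame to sustain real-time generation;
    # at 19.5 FPS that is about 51.3 ms per frame.
    return 1000.0 / fps

def frames_needed(seconds: float, fps: float) -> int:
    # Total frames for a clip of the given duration.
    return round(seconds * fps)
```

So generating one minute of video at 19.5 FPS means producing 1,170 frames, each inside a ~51 ms budget on the H100.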
ModelScope @ModelScope2022
Meet Twinkle ✨, our fully open-source implementation enabling training via APIs! 🚀 With a clean, modular client-server paradigm, you can implement your RL training in ~150 lines of code with Twinkle.
Why you'll love it:
🏡 Multi-tenant: train multiple LoRAs on ONE shared base model at the same time.
⚡️ Performant: Megatron & Transformers support for fast, stable training.
🛠️ Flexible: drop in the Tinker API or use native Twinkle APIs for finer-grained control.
Built by the team behind ms-swift, with both client AND server implementations fully open-source. Run locally, clustered, or try our serverless service hosted on ModelScope today!
🔗 github.com/modelscope/twi…
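The multi-tenant idea of training many LoRAs against one frozen base can be illustrated with plain matrix arithmetic: each tenant contributes only a low-rank delta B @ A on top of a shared weight W, which is never modified. This is a toy concept sketch, not Twinkle's actual API (all names here are illustrative):

```python
def matmul(a, b):
    # Naive matrix product over nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# One frozen base weight shared by every tenant (2x2 identity for the toy).
W = [[1.0, 0.0], [0.0, 1.0]]

# Per-tenant rank-1 adapters: each pair is (B: 2x1, A: 1x2).
tenants = {
    "tenant_a": ([[1.0], [0.0]], [[0.0, 2.0]]),
    "tenant_b": ([[0.0], [1.0]], [[3.0, 0.0]]),
}

def tenant_weight(name):
    # Effective weight for one tenant: W + B @ A. The base W is read-only,
    # so any number of tenants can share it simultaneously.
    B, A = tenants[name]
    return add(W, matmul(B, A))
```

Because only the small B and A matrices differ per tenant, the expensive base model lives in memory once while each tenant's optimizer touches only its own adapter.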
ModelScope @ModelScope2022
Style transfer with Qwen-Image-Edit-2511 + LoRA 🤩 Feed it any style reference and watch your artwork transform completely: color, mood, and atmosphere all carry over beautifully! Download the LoRA here 👉 modelscope.ai/models/daniel8…
大雄 @dx8152

This time, we're showcasing the Qwen-Image-Edit-2511 model, a fun LoRA model for migrating everything, used in the LoRA training competition. Download link and more examples are in the comments section. @Ali_TongyiLab @Alibaba_Qwen @ModelScope2022 #HappyQwensday #QwenImageLoRA
