刘聪NLP
60 posts

刘聪NLP retweeted

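December open-source model roundup: which model update are you most looking forward to in 2026?
It's the last day of the month again, and the roundup is done. It covers 27 models from DeepSeek, Zhipu, Xiaomi, MiniMax, Meituan, StepFun, Qwen, Tongyi, Tencent, and others.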
mp.weixin.qq.com/s/4Vzbk2u4jowc…


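October open-source model roundup: Qwen keeps pushing, with Qwen3-VL now released in every size; Hunyuan holds firmly onto first place in world models; Ant Group shipped three models in a row (Ling, Ring, and Ming); Meituan pivoted to multimodal and moved into video; OCR was the biggest breakout, with DeepSeek strong on concept and Paddle strong on practical deployment; MiniMax, Kuaishou, and others also kept shipping.
mp.weixin.qq.com/s/CtNq_ZLj_JE5…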

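Throughout September, the open-source LLM community stayed fiercely competitive. Alibaba open-sourced Qwen3-Omni, Qwen3-Next, Qwen3-VL, and other models; Tencent open-sourced 7 models. Both are now in mass production in the open-source community, hahaha~
Of course, there were also Meituan LongCat-Thinking, Kuaishou Keye-VL1.5, ModelBest (面壁) VoxCPM, and many more!
DeepSeek-V3.2 and GLM4.6 also came out in the last two days of the month.
zhuanlan.zhihu.com/p/195640481989…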

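Overall impressions:
- Include /think for deep thinking or /no_think for a direct answer; with neither, the model runs in auto mode and decides for itself
- Keye-VL1.5 understands short videos well, including some meme-style videos
- OCR and image understanding are also good
- Grounding has been specifically optimized and can localize targets precisely
- Because the model is only 8B, it still falls somewhat short on world knowledge, spatial reasoning, and spatial transformations
刘聪NLP @logcong0120
Kuaishou open-sources Keye-VL1.5. The architecture is still the classic three-part setup: a vision encoder (ViT), an MLP projection layer, and an LLM decoder. For video, it introduces Slow-Fast video encoding: key actions are examined closely in high-resolution slow motion, while static backgrounds are skimmed in fast motion, which saves compute without losing detail. mp.weixin.qq.com/s/e3262cNNJPv4…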
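
The /think, /no_think, and auto behavior noted in the impressions above can be sketched as a small prompt-side helper. This is only an illustration, not Keye-VL1.5's actual API: the function name select_thinking_mode, the whitespace-based flag matching, and the rule that /no_think wins when both flags appear are all assumptions.

```python
from typing import Tuple

# Flag strings as written in the post; how the real model parses them is not stated there.
THINK, NO_THINK = "/think", "/no_think"

def select_thinking_mode(prompt: str) -> Tuple[str, str]:
    """Return (mode, prompt_without_flags); mode is 'think', 'no_think', or 'auto'."""
    tokens = prompt.split()
    if NO_THINK in tokens:
        # Assumption: an explicit /no_think takes precedence if both flags are present.
        mode = "no_think"
    elif THINK in tokens:
        mode = "think"
    else:
        # No flag: leave the decision to the model.
        mode = "auto"
    cleaned = " ".join(t for t in tokens if t not in (THINK, NO_THINK))
    return mode, cleaned
```

A caller would pass the cleaned prompt to the model and use the mode to decide whether to request a reasoning trace.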

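Kuaishou open-sources Keye-VL1.5
The architecture is still the classic three-part setup: a vision encoder (ViT), an MLP projection layer, and an LLM decoder.
For video, it introduces Slow-Fast video encoding: key actions are examined closely in high-resolution slow motion, while static backgrounds are skimmed in fast motion, which saves compute without losing detail.
mp.weixin.qq.com/s/e3262cNNJPv4…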
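
To make the Slow-Fast idea concrete, here is a minimal sketch of one way frames could be split into a detailed "slow" path and a cheap "fast" path. It is not the Keye-VL1.5 implementation: the mean-absolute-difference test, the 0.1 threshold, and the per-frame token budgets (256 and 64) are all illustrative assumptions.

```python
from typing import List, Sequence

def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Average absolute pixel difference between two equally sized, flattened frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def assign_pathways(frames: Sequence[Sequence[float]], threshold: float = 0.1) -> List[str]:
    """Label each frame 'slow' (changed a lot, inspect in detail) or 'fast' (mostly static)."""
    labels: List[str] = []
    last_slow = None
    for frame in frames:
        # The first frame, or any frame that differs enough from the last detailed one, goes slow.
        if last_slow is None or mean_abs_diff(frame, last_slow) > threshold:
            labels.append("slow")
            last_slow = frame
        else:
            labels.append("fast")
    return labels

def token_budget(labels: Sequence[str], slow_tokens: int = 256, fast_tokens: int = 64) -> int:
    """Total visual tokens when slow frames get a larger (hypothetical) budget than fast ones."""
    return sum(slow_tokens if label == "slow" else fast_tokens for label in labels)

# Four tiny 4-pixel "frames": a static pair, then a scene change, then another static frame.
frames = [[0.0] * 4, [0.01] * 4, [0.5] * 4, [0.52] * 4]
labels = assign_pathways(frames)
```

With these toy frames, the first and third frames take the slow path and the other two take the fast path, so the total budget is lower than encoding every frame at the slow rate.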

刘聪NLP retweeted

📅 China's Open-Source LLM Boom in August — A detailed recap by Zhihu mind explorer @logcong0120
🔎 TL;DR:
The open-source race in China is still intense: more players, more models, more action. Did you miss any? 👇
• Aug 1 · XBai-o4 (32B) by @theMetaStoneAI: Based on Qwen3-32B, excels in complex reasoning, beats OpenAI-o3-mini.
• Aug 4 · @TencentHunyuan released 4 small models (0.5B–7B) as Qwen3 competitors.
• Aug 4 · @Alibaba_Qwen Qwen-Image: Text-to-image model with fine-grained layout + paragraph rendering.
• Aug 4 · @Xiaomi MiDashengLM-7B: Audio LLM that outperforms Qwen2.5 & Kimi in audio understanding.
• Aug 6 · @xiaohongshu dots.vlm1, combining NaViT visual encoder + DeepSeek V3 LLM.
• Aug 7 · Qwen3-4B-Instruct & -Thinking (Dense models)
• Aug 8 · @OpenBMB MiniCPM-V-4 (4B): Real-time video/image understanding on phones & PCs.
• Aug 11 · Baichuan-M2-32B (medical LLM) & @Zai_org GLM4.5-V (106B MoE with "thinking mode")
• Aug 12 · Lumina-mGPT 2.0 (Shanghai AI Lab): Decoder-only model for unified vision tasks.
• Aug 12 · Kuaishou Klear-Reasoner-8B
• Aug 13 · StepFun-Prover-32B (theorem-proving)
• Aug 14 · @TencentHunyuan Hunyuan-GameCraft: Interactive game video generation from image + text + actions.
• Aug 18 · @StepFun_ai NextStep-1: Includes a 14B LLM + image generation/editing model.
• Aug 19 · @Alibaba_Qwen Qwen-Image-Edit (20B): Brings precision text rendering into image editing.
• Aug 20 · @deepseek_ai DeepSeek-V3.1: Improved coding, slightly weaker on general text.
• Aug 21 · ByteDance Seed-OSS (36B)
• Aug 23 · @intern_lm Intern-S1-mini (8B): Strong for scientific tasks.
• Aug 26 · @OpenBMB MiniCPM-V 4.5 (8B): High-frame-rate video understanding
• Aug 26 · @intern_lm InternVL 3.5 series: 9 models, Dense + MoE
• Aug 26 · @Alibaba_Wan Wan2.2-S2V-14B: Text + image + audio → lifelike digital human video.
• Aug 28 · HunyuanVideo-Foley: Auto sound effects for video
• Aug 28 · @BytedanceTalk USO: Style + subject controllable image generation
• Aug 31 · @Meituan_LongCat (560B MoE): Dynamic routing activates 18.6B–31.3B parameters per query.
💡 Missed any? Catch the full recap on Zhihu (CN): zhuanlan.zhihu.com/p/194578262172…
#ChinaAI #LLM #OpenSource #Multimodal #AI

刘聪NLP retweeted

Intern-S1-mini 🔥 lightweight open multimodal reasoning model by @intern_lm
huggingface.co/internlm/Inter…
✨ Efficient 8B LLM + 0.3B vision encoder
✨ Apache 2.0
✨ 5T multimodal pretraining, 50%+ in scientific domains
✨ Dynamic tokenizer for molecules & protein sequences

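I get the feeling that for Chinese LLM makers, not open-sourcing is now practically a sin.
ByteDance has also open-sourced its Seed-OSS model: 36B, a sweet-spot size, along with a pretrained model that excludes synthetic data.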
mp.weixin.qq.com/s/I823_ajeTG_s…

刘聪NLP retweeted

🚀 China's Open-Source AI Models Are Booming!
In July alone, 🇨🇳 flooded @huggingface Trending with 9/10 top models - open-sourced by Chinese teams.
Here's your lightning-fast recap compiled by Zhihu Mind Explorer @logcong0120
Now let’s break down the July drop:
📅 June 27 – @TencentHunyuan releases Hunyuan A13B: 80B total, 13B active params. Fills the 70-80B size gap.
📅 June 30 – @Baidu_Inc open-sources ERNIE 4.5: full-size LLMs + multimodal versions.
📅 July 1 – @Alibaba_Qwen drops ThinkSound, the first CoT audio model for frame-level video dubbing.
📅 July 2 – @Zai_org releases GLM-4.1V-Thinking (9B), a powerful vision-language model.
📅 July 4 – 昆仑万维 launches Skywork-Reward-V2, 8 different reward models (600M-80B).
📅 July 8 – @AntGroup open-sources KAG-Thinker, a deep reasoning model for multi-hop cognitive tasks.
📅 July 9 – 昆仑万维 again, with Skywork-R1V3, a multimodal model fine-tuned from InternVL-38B.
📅 July 11 – @Kimi_Moonshot Kimi-K2 gets 12K+ downloads in 20 minutes, with Base + Instruct models.
📅 July 12 – Zhihu unveils Zhi-Create, a creative writing model fine-tuned on Qwen3-32B.
📅 July 19 – @BytedanceTalk drops Seed-X, a multilingual translation series (7B) covering Instruct, RM, PPO.
📅 July 21-25 – @Alibaba_Qwen open-sources three giants: Qwen3-235B-A22B-Instruct, Qwen3-Coder-480B-A35B-Instruct and Qwen3-235B-A22B-Thinking.
📅 July 26 – Shanghai AI Lab unveils Intern-S1, a massive 241B multimodal reasoning model.
📅 July 27 – @TencentHunyuan drops HunyuanWorld-1, the first open 3D immersive, interactive, and simulated world generation model. Game dev, VR, content creation = transformed.
📅 July 28 – @Alibaba_Wan goes big with Wan2.2, the first MoE-based video generation foundation model. Includes: T2V (text-to-video), I2V (image-to-video), TI2V (unified video generation).
📅 July 28 – @Zai_org releases the GLM-4.5 series: GLM-4.5 355B-A32B and GLM-4.5-Air 106B-A12B, which shot straight to the top of HuggingFace.
📅 July 30 – @Alibaba_Qwen adds two “friendly size” releases: Qwen3-30B-A3B-Instruct and Qwen3-30B-A3B-Thinking
📅 July 30 – 昆仑万维 launches Skywork-UniPic-1.5B, a unified multimodal model for image understanding, generation, and editing.
📅 July 31 – @StepFun_ai open-sources Step 3, setting new benchmarks in multimodal reasoning efficiency.
🤔 Open-sourcing isn't just technical - it’s strategic, and China's making moves at a blistering pace.
📖 Full article on Zhihu: zhuanlan.zhihu.com/p/193419697965…
#OpenSource #LLM #AI #ChinaAI

