WeiboLLM

16 posts

@WeiboLLM

Building the Future with AI

Singapore · Joined September 2025
12 Following · 492 Followers
WeiboLLM@WeiboLLM·
@FGuzmanAI And with only ~1.5GB of RAM for such high-performance reasoning—simply incredible!
0 replies · 0 reposts · 4 likes · 58 views
WeiboLLM@WeiboLLM·
@FGuzmanAI Thrilled to see the model running on iPhone with 4-bit quantization and MLX! The community has been waiting for this. Fantastic work! 🔥
1 reply · 0 reposts · 5 likes · 420 views
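For anyone curious what that setup looks like in practice, here is a minimal sketch using Apple's mlx-lm package. The 4-bit repo id is an assumption (community conversions usually live under the mlx-community org), not an official artifact name.

```python
# Minimal sketch, assuming a community 4-bit MLX conversion exists under the
# repo id below (illustrative, not an official artifact).
# pip install mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/VibeThinker-1.5B-4bit")  # assumed repo id

messages = [{"role": "user", "content": "If 3x + 5 = 20, what is x?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True)

# Runs on Apple silicon; this is the same stack that makes
# phone-class inference feasible.
print(generate(model, tokenizer, prompt=prompt, max_tokens=512))
```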
WeiboLLM@WeiboLLM·
@MaziyarPanahi Thanks for the shout-out! It's great to see your work on quantizing and running VibeThinker-1.5B so smoothly on device. Massive respect!
1 reply · 0 reposts · 8 likes · 630 views
Maziyar PANAHI@MaziyarPanahi·
it's crazy what a 1.5B model can do these days! "VibeThinker-1.5B is a 1.5-billion parameter dense language model. With a total training cost of only $7,800 USD, it achieves reasoning performance comparable to larger models like GPT OSS-20B Medium." runs perfectly on device!
32 replies · 83 reposts · 902 likes · 202.2K views
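As a rough idea of the on-device workflow described here, a sketch with llama-cpp-python against a local GGUF quantization; the filename below is illustrative, so substitute whichever quantized file you actually downloaded.

```python
# Minimal sketch, assuming you have a local GGUF quantization of the model;
# the filename below is illustrative.
# pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(model_path="VibeThinker-1.5B-Q4_K_M.gguf", n_ctx=4096)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is 17 * 23?"}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```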
WeiboLLM@WeiboLLM·
VibeThinker-1.5B hit #1 on @huggingface ’s trending models today! 🔥 Huge thank you to our amazing community for the love, downloads, and priceless feedback.❤️
[image attached]
1 reply · 2 reposts · 18 likes · 769 views
Hardik Bhatnagar@hrdkbhatnagar·
🚨 Breaking: @WeiboLLM's VibeThinker 1.5B leads the Sober Reasoning leaderboard for its size. Punching way above its weight -- outperforming even 32B models 🔥 Outstanding work, @WeiboLLM team!
[image attached]
1 reply · 6 reposts · 16 likes · 2.1K views
WeiboLLM@WeiboLLM·
@ahochlehnert Thank you so much! Independent evaluations like this make the open-source world better for everyone. Grateful for the love and support!
0 replies · 0 reposts · 2 likes · 62 views
WeiboLLM@WeiboLLM·
@MaziyarPanahi @lmstudio Thanks for the support and all the recommendations! Glad we could help. We’ll keep improving and would love to hear your thoughts anytime — together we go further!
1 reply · 0 reposts · 1 like · 324 views
WeiboLLM@WeiboLLM·
I strongly agree with your perspective. We recently open-sourced a 1.5B small model, which performs well on competition-level math and code problems. On benchmarks like AIME and HMMT, it even surpasses DeepSeek R1-0120, and its cost is less than $8,000. We look forward to your thoughts on this model. x.com/WeiboLLM/statu…
0 replies · 0 reposts · 2 likes · 39 views
clem 🤗@ClementDelangue·
Am I wrong in sensing a paradigm shift in AI? Feels like we're moving from a world obsessed with generalist LLM APIs to one where more and more companies are training, optimizing, and running their own models built on open source (especially smaller, specialized ones). Some validating signs just in the past few weeks:
- @karpathy released nanochat to train models in just a few lines of code
- @thinkymachines launched a fine-tuning product
- rising popularity of @vllm_project, @sgl_project, @PrimeIntellect, LoRAs, trl, ...
- 1M new repos on HF in the past 90 days (including the first open-source LLMs from @OpenAI)
And now, @nvidia just announced DGX Spark, powerful enough for everyone to fine-tune their own models at home. Would you agree, or am I just seeing the future I want to exist? Also, why is this happening (just the advent of RL/post-training?)
[image attached]
146 replies · 217 reposts · 2K likes · 384.6K views
WeiboLLM@WeiboLLM·
@gm8xx8 Curious to get your thoughts on our new 1.5B model, VibeThinker. We're seeing it challenge scaling laws: it outperforms a 671B model on AIME math, and was trained for only $7.8k using our "Spectrum-to-Signal Principle." It's open-source, details below: x.com/WeiboLLM/statu…
0 replies · 0 reposts · 2 likes · 130 views
𝚐𝔪𝟾𝚡𝚡𝟾@gm8xx8·
Diffusion Language Models are Super Data Learners… now on arXiv with MegaDLMs, the full large-scale training framework (6.1K H100s, 462B-param run, 47% MFU). Supports diffusion and autoregressive LMs, dense and MoE architectures, FP8/BF16/FP16 precision, and multi-axis parallelism (TP, PP, EP, CP) built on Megatron and Transformer Engine.
[image attached]
2 replies · 39 reposts · 248 likes · 15.3K views
WeiboLLM@WeiboLLM·
@reach_vb Curious to get your thoughts on our new 1.5B model, VibeThinker. We're seeing it challenge scaling laws: it outperforms a 671B model on AIME math, and was trained for only $7.8k using our "Spectrum-to-Signal Principle." It's open-source, details below: x.com/WeiboLLM/statu…
0 replies · 0 reposts · 1 like · 88 views
Vaibhav (VB) Srivastav@reach_vb·
2026 will be the year when models become leaner and faster (whilst becoming smarter). A big part of human-in-the-loop DX is how fast the model responses are.
3 replies · 1 repost · 18 likes · 2.5K views
WeiboLLM@WeiboLLM·
@_akhaliq Curious to get your thoughts on our new 1.5B model, VibeThinker. We're seeing it challenge scaling laws: it outperforms a 671B model on AIME math, and was trained for only $7.8k using our "Spectrum-to-Signal Principle." It's open-source, details below: x.com/WeiboLLM/statu…
0 replies · 0 reposts · 1 like · 21 views
AK@_akhaliq·
Robot Learning from a Physical World Model
5 replies · 21 reposts · 198 likes · 34.3K views
WeiboLLM@WeiboLLM·
First, we encourage it to explore many possible answers (Spectrum Phase). Then, we teach it to identify & amplify the best ones (Signal Phase). This "explore, then focus" method is key to its strong reasoning.
[image attached]
2 replies · 0 reposts · 22 likes · 2.9K views
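To make the two phases concrete, here is a toy, self-contained sketch of an "explore, then focus" loop. This is not the team's SSP/MGPO implementation, just an illustration in the spirit of sample-many-candidates, then reinforce only the verified ones.

```python
# Toy sketch of "explore, then focus" (NOT the actual SSP/MGPO code):
# sample many candidates per problem, keep only verified ones, and shift
# probability mass toward them.
import random

def verify(problem, answer):
    """Signal: a binary reward, e.g. an exact-match math checker."""
    return answer == problem["target"]

def explore_then_focus_step(policy, problem, n=8, lr=0.5):
    """One update on a toy tabular 'policy' over candidate answers."""
    answers, weights = list(policy), list(policy.values())
    # Spectrum phase: draw a diverse batch of candidate answers.
    candidates = random.choices(answers, weights=weights, k=n)
    # Signal phase: amplify only the candidates that verify as correct.
    for ans in candidates:
        if verify(problem, ans):
            policy[ans] += lr
    total = sum(policy.values())
    for k in policy:  # renormalize back to a distribution
        policy[k] /= total

# Toy problem whose verified answer is "5" (x in 3x + 5 = 20).
problem = {"prompt": "3x + 5 = 20, x = ?", "target": "5"}
policy = {"3": 1.0, "5": 1.0, "7": 1.0}  # start out uniform
for _ in range(50):
    explore_then_focus_step(policy, problem)
print(policy)  # mass concentrates on the verified answer "5"
```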
WeiboLLM@WeiboLLM·
More benchmark results and comparisons with other models:
[image attached]
1 reply · 0 reposts · 21 likes · 3.6K views
WeiboLLM@WeiboLLM·
⭐ VibeThinker-1.5B — SOTA reasoning in a tiny model.
🚀 Performance: Highly competitive on AIME24/25 & HMMT25 — surpasses DeepSeek R1-0120 on math, and outperforms same-size models in competitive coding.
⚡ Efficiency: Only 1.5B params — 100-600× smaller than giants like Kimi K2 & DeepSeek R1.
💰 Cost: Full post-training for just $7.8K — 30-60× cheaper than DeepSeek R1 or MiniMax-M1.
🧠 Innovation: Powered by our Spectrum-to-Signal Principle (SSP) and MGPO framework.
Model: huggingface.co/WeiboAI/VibeTh…
Github: github.com/WeiboAI/VibeTh…
Arxiv: arxiv.org/abs/2511.06221
#AI #LLM #Reasoning #OpenSource #SmallModel
[image attached]
28 replies · 58 reposts · 383 likes · 110.8K views
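For reference, a minimal sketch of loading the checkpoint with Hugging Face transformers; the repo id is inferred from the truncated link above and should be treated as an assumption.

```python
# Minimal sketch; the repo id is assumed from the truncated link above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "WeiboAI/VibeThinker-1.5B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)

messages = [{"role": "user", "content": "How many primes are below 30?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt")
out = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```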