WeiboLLM

16 posts

@WeiboLLM

Building the Future with AI

Singapore · Joined September 2025
12 Following · 492 Followers
WeiboLLM@WeiboLLM·
@FGuzmanAI And with only ~1.5GB of RAM for such high-performance reasoning—simply incredible!
0 replies · 0 reposts · 4 likes · 58 views
WeiboLLM@WeiboLLM·
@FGuzmanAI Thrilled to see the model running on iPhone with 4-bit quantization and MLX! The community has been waiting for this. Fantastic work! 🔥
1 reply · 0 reposts · 5 likes · 420 views
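For anyone curious what that setup looks like in practice, here is a minimal sketch using Apple's mlx-lm package. The 4-bit repo id is an assumption (community conversions usually live under the mlx-community org), not an official artifact name.

```python
# Minimal sketch, assuming a community 4-bit MLX conversion exists under the
# repo id below (illustrative, not an official artifact).
# pip install mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/VibeThinker-1.5B-4bit")  # assumed repo id

messages = [{"role": "user", "content": "If 3x + 5 = 20, what is x?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True)

# Runs on Apple silicon; this is the same stack that makes
# phone-class inference feasible.
print(generate(model, tokenizer, prompt=prompt, max_tokens=512))
```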
WeiboLLM@WeiboLLM·
@MaziyarPanahi Thanks for the shout-out! It's great to see your work on quantizing and running VibeThinker-1.5B so smoothly on device. Massive respect!
1 reply · 0 reposts · 8 likes · 630 views
Maziyar PANAHI@MaziyarPanahi·
it's crazy what a 1.5B model can do these days! "VibeThinker-1.5B is a 1.5-billion parameter dense language model. With a total training cost of only $7,800 USD, it achieves reasoning performance comparable to larger models like GPT OSS-20B Medium." runs perfectly on device!
32 replies · 83 reposts · 902 likes · 202.2K views
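As a rough idea of the on-device workflow described here, a sketch with llama-cpp-python against a local GGUF quantization; the filename below is illustrative, so substitute whichever quantized file you actually downloaded.

```python
# Minimal sketch, assuming you have a local GGUF quantization of the model;
# the filename below is illustrative.
# pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(model_path="VibeThinker-1.5B-Q4_K_M.gguf", n_ctx=4096)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is 17 * 23?"}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```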
WeiboLLM@WeiboLLM·
VibeThinker-1.5B hit #1 on @huggingface ’s trending models today! 🔥 Huge thank you to our amazing community for the love, downloads, and priceless feedback.❤️
[image attached]
1 reply · 2 reposts · 18 likes · 769 views
Hardik Bhatnagar@hrdkbhatnagar·
🚨 Breaking: @WeiboLLM's VibeThinker 1.5B leads the Sober Reasoning leaderboard for its size. Punching way above its weight -- outperforming even 32B models 🔥 Outstanding work, @WeiboLLM team!
[image attached]
1 reply · 6 reposts · 16 likes · 2.1K views
WeiboLLM@WeiboLLM·
@ahochlehnert Thank you so much! Independent evaluations like this make the open-source world better for everyone. Grateful for the love and support!
0 replies · 0 reposts · 2 likes · 62 views
WeiboLLM@WeiboLLM·
@MaziyarPanahi @lmstudio Thanks for the support and all the recommendations! Glad we could help. We’ll keep improving and would love to hear your thoughts anytime — together we go further!
1 reply · 0 reposts · 1 like · 324 views
WeiboLLM@WeiboLLM·
I strongly agree with your perspective. We recently open-sourced a 1.5B small model, which performs well on competition-level math and code problems. On benchmarks like AIME and HMMT, it even surpasses DeepSeek R1-0120, and its cost is less than $8,000. We look forward to your thoughts on this model. x.com/WeiboLLM/statu…
0 replies · 0 reposts · 2 likes · 39 views
clem 🤗@ClementDelangue·
Am I wrong in sensing a paradigm shift in AI? Feels like we're moving from a world obsessed with generalist LLM APIs to one where more and more companies are training, optimizing, and running their own models built on open source (especially smaller, specialized ones). Some validating signs just in the past few weeks:
- @karpathy released nanochat to train models in just a few lines of code
- @thinkymachines launched a fine-tuning product
- rising popularity of @vllm_project, @sgl_project, @PrimeIntellect, LoRAs, trl, ...
- 1M new repos on HF in the past 90 days (including the first open-source LLMs from @OpenAI)
And now, @nvidia just announced DGX Spark, powerful enough for everyone to fine-tune their own models at home. Would you agree, or am I just seeing the future I want to exist? Also, why is this happening (just the advent of RL/post-training?)
[image attached]
146 replies · 217 reposts · 2K likes · 384.6K views
WeiboLLM@WeiboLLM·
@gm8xx8 Curious to get your thoughts on our new 1.5B model, VibeThinker. We're seeing it challenge scaling laws: it outperforms a 671B model on AIME math, and was trained for only $7.8k using our "Spectrum-to-Signal Principle." It's open-source, details below: x.com/WeiboLLM/statu…
0 replies · 0 reposts · 2 likes · 130 views
𝚐𝔪𝟾𝚡𝚡𝟾@gm8xx8·
Diffusion Language Models are Super Data Learners… now on arXiv with MegaDLMs, the full large-scale training framework (6.1K H100s, 462B-param run, 47% MFU). Supports diffusion and autoregressive LMs, dense and MoE architectures, FP8/BF16/FP16 precision, and multi-axis parallelism (TP, PP, EP, CP) built on Megatron and Transformer Engine.
[image attached]
2 replies · 39 reposts · 248 likes · 15.3K views
WeiboLLM@WeiboLLM·
@reach_vb Curious to get your thoughts on our new 1.5B model, VibeThinker. We're seeing it challenge scaling laws: it outperforms a 671B model on AIME math, and was trained for only $7.8k using our "Spectrum-to-Signal Principle." It's open-source, details below: x.com/WeiboLLM/statu…
0 replies · 0 reposts · 1 like · 88 views
Vaibhav (VB) Srivastav@reach_vb·
2026 will be the year when models become leaner and faster (whilst becoming smarter). A big part of human-in-the-loop DX is how fast the model responses are.
3 replies · 1 repost · 18 likes · 2.5K views
WeiboLLM@WeiboLLM·
@_akhaliq Curious to get your thoughts on our new 1.5B model, VibeThinker. We're seeing it challenge scaling laws: it outperforms a 671B model on AIME math, and was trained for only $7.8k using our "Spectrum-to-Signal Principle." It's open-source, details below: x.com/WeiboLLM/statu…
0 replies · 0 reposts · 1 like · 21 views
AK@_akhaliq·
Robot Learning from a Physical World Model
5 replies · 21 reposts · 198 likes · 34.3K views
WeiboLLM@WeiboLLM·
First, we encourage it to explore many possible answers (Spectrum Phase). Then, we teach it to identify & amplify the best ones (Signal Phase). This "explore, then focus" method is key to its strong reasoning.
[image attached]
2 replies · 0 reposts · 22 likes · 2.9K views
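To make the two phases concrete, here is a toy, self-contained sketch of an "explore, then focus" loop. This is not the team's SSP/MGPO implementation, just an illustration in the spirit of sample-many-candidates, then reinforce only the verified ones.

```python
# Toy sketch of "explore, then focus" (NOT the actual SSP/MGPO code):
# sample many candidates per problem, keep only verified ones, and shift
# probability mass toward them.
import random

def verify(problem, answer):
    """Signal: a binary reward, e.g. an exact-match math checker."""
    return answer == problem["target"]

def explore_then_focus_step(policy, problem, n=8, lr=0.5):
    """One update on a toy tabular 'policy' over candidate answers."""
    answers, weights = list(policy), list(policy.values())
    # Spectrum phase: draw a diverse batch of candidate answers.
    candidates = random.choices(answers, weights=weights, k=n)
    # Signal phase: amplify only the candidates that verify as correct.
    for ans in candidates:
        if verify(problem, ans):
            policy[ans] += lr
    total = sum(policy.values())
    for k in policy:  # renormalize back to a distribution
        policy[k] /= total

# Toy problem whose verified answer is "5" (x in 3x + 5 = 20).
problem = {"prompt": "3x + 5 = 20, x = ?", "target": "5"}
policy = {"3": 1.0, "5": 1.0, "7": 1.0}  # start out uniform
for _ in range(50):
    explore_then_focus_step(policy, problem)
print(policy)  # mass concentrates on the verified answer "5"
```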
WeiboLLM@WeiboLLM·
More benchmark results and comparisons with other models:
[image attached]
1 reply · 0 reposts · 21 likes · 3.6K views
WeiboLLM@WeiboLLM·
⭐ VibeThinker-1.5B — SOTA reasoning in a tiny model.
🚀 Performance: Highly competitive on AIME24/25 & HMMT25 — surpasses DeepSeek R1-0120 on math, and outperforms same-size models in competitive coding.
⚡ Efficiency: Only 1.5B params — 100-600× smaller than giants like Kimi K2 & DeepSeek R1.
💰 Cost: Full post-training for just $7.8K — 30-60× cheaper than DeepSeek R1 or MiniMax-M1.
🧠 Innovation: Powered by our Spectrum-to-Signal Principle (SSP) and MGPO framework.
Model: huggingface.co/WeiboAI/VibeTh…
Github: github.com/WeiboAI/VibeTh…
Arxiv: arxiv.org/abs/2511.06221
#AI #LLM #Reasoning #OpenSource #SmallModel
[image attached]
28 replies · 58 reposts · 383 likes · 110.8K views
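For reference, a minimal sketch of loading the checkpoint with Hugging Face transformers; the repo id is inferred from the truncated link above and should be treated as an assumption.

```python
# Minimal sketch; the repo id is assumed from the truncated link above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "WeiboAI/VibeThinker-1.5B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)

messages = [{"role": "user", "content": "How many primes are below 30?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt")
out = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```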