Interconnects

324 posts

Interconnects

@interconnectsai

What you need to know about AI research trends, from @natolambert. Weekly on Wednesday mornings, with occasional extra posts.

Joined June 2023
2 Following · 8.1K Followers
Interconnects reposted
Nathan Lambert @natolambert
Open models, what comes next:
- Don't rely on open models catching the frontier
- The change from models to systems (tools/harness)
- Business models supporting open are far from viable (except Nvidia)
- How to change from a few key weights to a winning ecosystem
interconnects.ai/p/the-next-pha…
Replies: 10 · Reposts: 8 · Likes: 96 · Views: 18.3K
Interconnects reposted
Nathan Lambert @natolambert
For people who are just learning about Nemotron with the awesome Nemotron 3 Super drop, I recommend watching this interview I did with @ctnzr -- Nemotron as a project has been a LONG time coming. youtube.com/watch?v=Y3Vb6e…
Replies: 3 · Reposts: 11 · Likes: 105 · Views: 13.4K
Interconnects @interconnectsai
Dean Ball on open models and government control
Subtle precedents on the future of open models set by the unfolding Anthropic v. Department of War case. interconnects.ai/p/how-anthropi…
Replies: 0 · Reposts: 0 · Likes: 13 · Views: 10.9K
Interconnects @interconnectsai
Latest open artifacts (#19): @Alibaba_Qwen 3.5, @Zai_org GLM 5, @MiniMax_AI 2.5 — Chinese labs' latest push of the frontier.
Featuring breakdown & analysis of:
- Alibaba's Qwen 3.5 (from 0.8B to 397B), Z.ai's GLM-5 (744B), and @StepFun_ai's Step-3.5-Flash.
- Plus: introducing our Relative Adoption Metrics (RAM) to track underrated models like GPT-OSS.
- And covering new releases from: @MistralAI, @perplexity_ai, @cohere, @TrillionLabs, @OpenBMB, @nanbeige, @TheInclusionAI, @liquidai, @intern_lm, @JD_Corporate, and @meituan
By @natolambert and @xeophon
Replies: 2 · Reposts: 10 · Likes: 39 · Views: 8.2K
Interconnects reposted
Nathan Lambert @natolambert
Latest open artifacts (#19): Qwen 3.5, GLM 5, MiniMax 2.5 — Chinese labs' latest push of the frontier.
We're starting to roll out more analysis with the relative adoption metric (RAM). Winners: GPT-OSS, K2 Thinking, OCR models. Losers: DeepSeek v3.2. interconnects.ai/p/latest-open-…
Replies: 0 · Reposts: 11 · Likes: 78 · Views: 7.4K
Interconnects reposted
Nathan Lambert @natolambert
Open models are in a perpetual race to stay relevant at the frontier. While they're doing better than I, and many experts, would expect given the cost of building models, I don't see evidence that open models are accelerating past the best closed models. interconnects.ai/p/open-models-…
Replies: 22 · Reposts: 8 · Likes: 145 · Views: 50.5K
Interconnects @interconnectsai
Open models in perpetual catch-up
The open-closed gap, distillation, innovation timescales, how open models win, specialized models, what's missing, etc. interconnects.ai/p/open-models-…
Replies: 0 · Reposts: 1 · Likes: 7 · Views: 735
Interconnects reposted
Nathan Lambert @natolambert
After a long time testing the new Opus 4.6 and Codex 5.3 models, the most striking thing was how much trickier model releases are to read in 2026. I'm in my post-benchmark era. Claude is still king, but Codex is closer than ever. interconnects.ai/p/opus-46-vs-c…
Replies: 19 · Reposts: 24 · Likes: 263 · Views: 56.1K
Interconnects reposted
Nathan Lambert @natolambert
Top 100 LLMs by Downloads Since August 2025
Source: @interconnectsai HuggingFace Snapshots
Model list on GitHub: Interconnects-AI/tracked-models (~1.5K models)
Featuring: @alibaba_qwen: 40, @AIatMeta: 13, @deepseek_ai: 10, @Microsoft: 8, @GoogleAI: 7, @mistralai: 4, @OpenAI: 2, @allen_ai: 2, @vikhyatk: 1, @NVIDIAAI: 1, @huggingface: 1, @Zai_org: 1, @TencentGlobal: 1
1. meta-llama/Llama-3.1-8B-Instruct - 53.3M
2. Qwen/Qwen2.5-7B-Instruct - 52.4M
3. Qwen/Qwen2.5-VL-3B-Instruct - 49.5M
4. Qwen/Qwen2.5-3B-Instruct - 46.3M
5. Qwen/Qwen3-0.6B - 45.6M
6. openai/gpt-oss-20b - 43.1M
7. Qwen/Qwen2.5-1.5B-Instruct - 32.6M
8. meta-llama/Llama-3.2-1B-Instruct - 27.6M
9. Qwen/Qwen3-8B - 24.0M
10. Qwen/Qwen2.5-VL-7B-Instruct - 23.3M
11. openai/gpt-oss-120b - 22.3M
12. google/gemma-3-1b-it - 20.7M
13. Qwen/Qwen3-4B-Instruct-2507 - 19.7M
14. google/t5gemma-b-b-prefixlm - 17.5M
15. Qwen/Qwen3-4B - 17.1M
16. Qwen/Qwen3-32B - 15.5M
17. meta-llama/Llama-3.2-1B - 15.3M
18. Qwen/Qwen2-VL-2B-Instruct - 15.2M
19. Qwen/Qwen3-1.7B - 15.1M
20. deepseek-ai/DeepSeek-OCR - 15.0M
21. deepseek-ai/DeepSeek-R1-Distill-Qwen-32B - 14.5M
22. mistralai/Mistral-7B-Instruct-v0.2 - 13.3M
23. Qwen/Qwen3-Next-80B-A3B-Instruct - 13.0M
24. Qwen/Qwen2.5-0.5B-Instruct - 12.8M
25. meta-llama/Meta-Llama-3-8B - 12.0M
26. Qwen/Qwen2.5-Coder-0.5B-Instruct - 11.7M
27. meta-llama/Llama-3.2-3B-Instruct - 11.6M
28. vikhyatk/moondream2 - 11.2M
29. Qwen/Qwen2.5-14B-Instruct - 10.3M
30. Qwen/Qwen2.5-32B-Instruct - 9.2M
31. Qwen/Qwen3-VL-8B-Instruct - 8.7M
32. Qwen/Qwen2-VL-7B-Instruct - 8.6M
33. Qwen/Qwen2.5-7B - 8.5M
34. microsoft/Phi-3-mini-4k-instruct - 8.0M
35. meta-llama/Meta-Llama-3-8B-Instruct - 7.7M
36. google/gemma-3-27b-it - 7.7M
37. google/gemma-3-12b-it - 7.1M
38. llava-hf/llava-1.5-7b-hf - 7.1M
39. deepseek-ai/DeepSeek-R1-Distill-Llama-8B - 7.0M
40. google/gemma-3-4b-it - 7.0M
41. Qwen/Qwen3-VL-30B-A3B-Instruct - 6.9M
42. Qwen/Qwen3-4B-Base - 6.9M
43. deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B - 6.8M
44. Qwen/Qwen2.5-0.5B - 6.8M
45. meta-llama/Llama-3.1-8B - 6.8M
46. OpenGVLab/InternVL2-2B - 6.7M
47. Qwen/Qwen3-30B-A3B-Instruct-2507 - 6.5M
48. nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1 - 6.2M
49. mistralai/Mistral-7B-Instruct-v0.3 - 6.2M
50. Qwen/Qwen2.5-VL-32B-Instruct - 6.2M
51. deepseek-ai/DeepSeek-R1-Distill-Qwen-7B - 6.0M
52. microsoft/phi-2 - 6.0M
53. Qwen/Qwen3-14B - 5.8M
54. meta-llama/Llama-2-7b-hf - 5.5M
55. Qwen/Qwen2-1.5B-Instruct - 5.5M
56. microsoft/Florence-2-large - 5.3M
57. HuggingFaceTB/SmolLM2-135M - 4.8M
58. microsoft/phi-4 - 4.7M
59. meta-llama/Llama-3.1-70B-Instruct - 4.6M
60. zai-org/chatglm2-6b - 4.2M
61. Qwen/Qwen2.5-Coder-7B-Instruct - 4.2M
62. rednote-hilab/dots.ocr - 4.1M
63. OpenGVLab/InternVL3_5-241B-A28B-Instruct - 4.1M
64. Qwen/Qwen2.5-1.5B - 4.0M
65. OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview-HF - 3.9M
66. meta-llama/Llama-2-7b-chat-hf - 3.9M
67. Qwen/Qwen3-Coder-30B-A3B-Instruct - 3.9M
68. deepseek-ai/DeepSeek-R1 - 3.8M
69. mistralai/Mistral-Small-24B-Instruct-2501 - 3.7M
70. microsoft/Phi-3.5-vision-instruct - 3.7M
71. meta-llama/Llama-3.3-70B-Instruct - 3.7M
72. deepseek-ai/DeepSeek-V3 - 3.6M
73. OpenGVLab/InternVL3-78B - 3.6M
74. deepseek-ai/DeepSeek-R1-0528 - 3.5M
75. OpenGVLab/InternVL3-14B - 3.5M
76. Qwen/Qwen3-30B-A3B - 3.2M
77. Qwen/Qwen3-VL-2B-Instruct - 3.2M
78. meta-llama/Llama-3.2-3B - 3.2M
79. microsoft/Florence-2-base - 3.2M
80. google/paligemma2-3b-pt-224 - 3.2M
81. allenai/OLMo-2-0425-1B - 3.1M
82. Qwen/Qwen3-VL-32B-Instruct - 3.0M
83. tencent/HunyuanOCR - 3.0M
84. OpenGVLab/InternVL2-1B - 2.9M
85. Qwen/Qwen3-8B-Base - 2.8M
86. Qwen/Qwen2.5-VL-72B-Instruct - 2.8M
87. google/gemma-2-2b-it - 2.7M
88. llava-hf/llava-v1.6-mistral-7b-hf - 2.7M
89. microsoft/Phi-4-multimodal-instruct - 2.7M
90. mistralai/Mixtral-8x7B-Instruct-v0.1 - 2.7M
91. Qwen/Qwen3-VL-4B-Instruct - 2.7M
92. Qwen/Qwen2.5-Coder-1.5B - 2.7M
93. meta-llama/Llama-3.2-11B-Vision-Instruct - 2.6M
94. Qwen/Qwen2-0.5B - 2.6M
95. Qwen/Qwen3-0.6B-Base - 2.5M
96. Qwen/Qwen3-4B-Thinking-2507 - 2.5M
97. deepseek-ai/DeepSeek-R1-Distill-Llama-70B - 2.5M
98. deepseek-ai/deepseek-coder-1.3b-instruct - 2.4M
99. microsoft/Phi-3-mini-128k-instruct - 2.4M
100. allenai/olmOCR-2-7B-1025-FP8 - 2.3M
Replies: 13 · Reposts: 9 · Likes: 105 · Views: 76.7K
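The leaderboard in the thread above is just models sorted by cumulative download count. A minimal sketch of that ranking-and-formatting step, with illustrative sample numbers rather than the actual snapshot data (in practice the counts could come from a source such as the Hugging Face Hub API, e.g. `huggingface_hub.model_info(...).downloads`):

```python
# Sketch: rank models by download count and format a leaderboard.
# The sample numbers below are illustrative, not the real snapshot data.
sample_downloads = {
    "meta-llama/Llama-3.1-8B-Instruct": 53_300_000,
    "Qwen/Qwen2.5-7B-Instruct": 52_400_000,
    "openai/gpt-oss-20b": 43_100_000,
}

def format_leaderboard(downloads: dict[str, int]) -> list[str]:
    """Sort models by downloads (descending) and render '<rank>. <id> - <N.N>M' lines."""
    ranked = sorted(downloads.items(), key=lambda kv: kv[1], reverse=True)
    return [
        f"{rank}. {model} - {count / 1e6:.1f}M"
        for rank, (model, count) in enumerate(ranked, start=1)
    ]

for line in format_leaderboard(sample_downloads):
    print(line)  # e.g. "1. meta-llama/Llama-3.1-8B-Instruct - 53.3M"
```

This reproduces the "rank. org/model - downloads" layout of the tweet; the actual snapshot pipeline (the Interconnects-AI/tracked-models list) is assumed, not shown.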
Interconnects @interconnectsai
Why did @NVIDIA build Megatron? 🤖⚡ @ctnzr breaks down the origin story of the project that proved state-of-the-art Transformers could be built on NVIDIA hardware. The name? Let’s just say they wanted the "biggest and baddest" Transformer out there. 🦾
Replies: 0 · Reposts: 1 · Likes: 16 · Views: 1.4K