Mia

326 posts

@MiaAI_lab

Local AI, LLMs, tech thinker & builder

Joined July 2022

191 Following, 194 Followers

Pinned Tweet

Mia@MiaAI_lab·4d

Run DeepSeek v4 Flash locally on your 2x DGX Sparks easily, with 1M context github.com/MiaAI-Lab/Deep…

1.2K

Mia@MiaAI_lab·1h

A PR to vLLM to allow TP=3 for MiniMax M3 👀 His NVFP4 quant is 260GB - lukealonso/MiniMax-M3-NVFP4 Hopefully this will work for anyone with 3x DGX Sparks, 87GB per Spark. github.com/vllm-project/v…

Mia@MiaAI_lab·1h

@TTrimoreau The ones who know how to use it the best.

Thomas Trimoreau@TTrimoreau·9h

At this point if AI writes 99% of code, who even survives in tech??

Mia@MiaAI_lab·1h

@buildwithhassan And it's still underrated, especially the flash.

Hassan@buildwithhassan·1d

opencode published their real model usage data. what developers actually run when they're paying for it: 1. deepseek v4 flash: 32T tokens 2. deepseek v4 pro: 19T tokens 3. kimi k2.6: 6.5T tokens deepseek is running more tokens than the next 16 models combined. it's actual usage from developers spending their own money. glm-5.1 grew 419% too. the models winning on price and reliability aren't always the ones winning on twitter.

492

32.5K

Mia@MiaAI_lab·1h

@Kimi_Moonshot Can we please have a Kimi k2.7 Flash variant?

Kimi.ai@Kimi_Moonshot·2d

🌘 Kimi-K2.7-Code, our latest coding model, is now released and open-sourced! 🔷 Improved coding & agent performance over K2.6: +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite. 🔷 Reasoning efficiency: Less overthinking, with 30% lower reasoning-token usage compared to K2.6. 🔷 Long-horizon coding: Improved instruction following, higher end-to-end coding task success rates. ⚡️ 6x High-Speed Mode coming soon! 🔌 Available today via Kimi API and Kimi Code. 🔗 Kimi Code: kimi.com/code 🔗 API: platform.moonshot.ai

614

1.6K

13.6K

1.9M

Mia@MiaAI_lab·1h

@HarshithLucky3 MiniMax M3, Kimi k2.7, GLM 5.2. But honestly, DeepSeek v4 Flash/Pro are underrated, especially the Flash.

Harshith@HarshithLucky3·8h

1) Kimi K2.7 2) Xiaomi MiMo v2.5 pro

Taniya@Taniyatweets_

be honest, which is the best open source AI model? 1. Qwen 3.5 2. DeepSeek v4 3. GLM-5 4. Kimi K2.6

1.5K

Mia@MiaAI_lab·1h

@HuggingModels MiniMax M3 and its big new brothers are pushing me towards adding another two DGX Sparks.

Hugging Models@HuggingModels·11h

Imagine a model that can see images, read text, and even understand video. Meet MiniMax-M3, a multimodal MoE powerhouse that's taking AI to the next level. It's not just another LLM, it's a vision, text, and video maestro. #AI #Multimodal

3.2K

Mia@MiaAI_lab·1h

@catalinmpit Dual @NVIDIAAI DGX Sparks running DeepSeek v4 Flash at 45 tok/s

Catalin@catalinmpit·4h

For those running local models, what’s your machine configuration? I’m thinking of selling my MacBook Pro M4 Max 48GB RAM and building a PC. Then get a MacBook Air for interacting with the LLM from the PC.

6.3K

Mia@MiaAI_lab·1h

@mr_r0b0t @DerekColley_ @Tech2Wild @UnslothAI 27b-MTP was my go-to on my 5090. It's very good.

mr-r0b0t@mr_r0b0t·3h

@DerekColley_ @Tech2Wild @UnslothAI Have you tried 27B-MTP? The NVFP4 is actually decently fast!

Tech2Wild@Tech2Wild·23h

Qwen has disappeared, and MiniMax has gone from 200B -> 400B. So who will save the day for the Single and Dual Sparkers? DeepSeek V4.1?

4.8K

Mia@MiaAI_lab·4h

@0xSero I need to adopt this asap

208

0xSero@0xSero·4h

I haven’t seen any political posts in 6 months

254

5.5K

Mia@MiaAI_lab·5h

@0xhikigaya @TheVixhal The minimum VRAM/unified RAM for Qwen 3.6 27b would be 24GB, preferably 32GB.

Hikigaya☔️@0xhikigaya·5h

@MiaAI_lab @TheVixhal I thought ram matters right like how much ram?

vixhaℓ@TheVixhal·6h

One day, Mythos / GPT-5.5 Pro-level models will run locally on my laptop.

227

9.1K

Mia@MiaAI_lab·5h

@0xhikigaya @TheVixhal The @NVIDIAAI RTX laptops too, once they're available.

Hikigaya☔️@0xhikigaya·5h

@MiaAI_lab @TheVixhal What specification of laptop u need to run qwen 3.6 27b ?

Mia@MiaAI_lab·5h

@TheVixhal Look at how good Qwen 3.6 27b is. It's at Sonnet 4.5 level, at least. Sonnet 4.5 was released on September 29, 2025. Do the math.

vixhaℓ@TheVixhal·5h

@MiaAI_lab Hoping you're right... What makes you say it's closer than it seems?

284

Mia@MiaAI_lab·5h

@AgentSparko Dual @NVIDIAAI DGX Sparks is currently the sweet spot. Can run DeepSeek-v4-Flash with 1M context at 45 tok/s. 4 Sparks would let you load the really heavy stuff. github.com/MiaAI-Lab/Deep…

AgentSparko 💥@AgentSparko·3d

I said so many times that people sleep on the DGX Spark because DFlash, DDTree, dLLM will fix the memory bandwidth issue and they did not believe me.

stevibe@stevibe

My first reaction: How is that possible? Running DiffusionGemma 26B A4B NVFP4 on my DGX Spark at 161.9 tok/s!

2.5K

Mia@MiaAI_lab·6h

@mr_r0b0t @morganlinton @NVIDIAAI Same. I was considering buying 2 RTX 6000s but it's just not justifiable. I'm honestly shocked how satisfied I am with the DGX Sparks.

mr-r0b0t@mr_r0b0t·6h

@morganlinton @NVIDIAAI tbh my RTX6000 dreams were shattered with the recent price increase, combined with this sale 😂 crazy to think I'll have 3x GB10 (all 4TB) for less than a new 6000 after the most recent increase 😶‍🌫️😶‍🌫️😶‍🌫️

mr-r0b0t@mr_r0b0t·23h

So I did a thing 😁

118

6.5K

Mia@MiaAI_lab·8h

Bro living the dream. 1TB of unified ram running Kimi k2.6.

Christian Merrill@M_Chimiste

@MiaAI_lab @QuixiAI @deepseek_ai @StepFun_ai This is how it’s currently configured with the weights being stored on dedicated M.2 drives on the side. I probably should change the configuration since I believe it’s slower with them stacked like this but it’s more convenient space wise.

155

Mia@MiaAI_lab·8h

@M_Chimiste @QuixiAI @deepseek_ai @StepFun_ai Insane. How many tok/s when in deep context?

Christian Merrill@M_Chimiste·8h

@MiaAI_lab @QuixiAI @deepseek_ai @StepFun_ai For K2.6, yes with EXO and a thunderbolt cable with RDMA.

Mia@MiaAI_lab·1d

DeepSeek-v4-Flash beats Step-3.7-Flash in a head-to-head tool-calling benchmark. Full results at: github.com/MiaAI-Lab/Deep…

2.8K

Mia@MiaAI_lab·8h

Exactly my point. And they are going to IPO soon.

Mia@MiaAI_lab

The publicity @AnthropicAI got from the Fable 5 drama is going to create even more demand for it. There is no such thing as bad publicity if your product is good.
