Mia

311 posts

@MiaAI_lab

Local AI, LLMs, tech thinker & builder

Joined July 2022
191 Following 193 Followers
Mia
Mia@MiaAI_lab·
@mr_r0b0t @morganlinton @NVIDIAAI Same. I was considering buying 2 RTX 6000s but it's just not justifiable. I'm honestly shocked how satisfied I am with the DGX Sparks.
0
0
1
5
mr-r0b0t
mr-r0b0t@mr_r0b0t·
@morganlinton @NVIDIAAI tbh my RTX6000 dreams were shattered with the recent price increase, combined with this sale 😂 crazy to think I'll have 3x GB10 (all 4TB) for less than a new 6000 after the most recent increase 😶‍🌫️😶‍🌫️😶‍🌫️
2
0
2
18
mr-r0b0t
mr-r0b0t@mr_r0b0t·
So I did a thing 😁
mr-r0b0t tweet media
35
0
114
5.8K
Mia
Mia@MiaAI_lab·
Bro living the dream. 1TB of unified ram running Kimi k2.6.
Christian Merrill@M_Chimiste

@MiaAI_lab @QuixiAI @deepseek_ai @StepFun_ai This is how it’s currently configured with the weights being stored on dedicated M.2 drives on the side. I probably should change the configuration since I believe it’s slower with them stacked like this but it’s more convenient space wise.

0
0
0
95
Mia
Mia@MiaAI_lab·
DeepSeek-v4-Flash beats Step-3.7-Flash in head-to-head tool calling benchmark. Full results in: github.com/MiaAI-Lab/Deep…
Mia tweet media
10
0
35
2.7K
Mia
Mia@MiaAI_lab·
Exactly my point. And they are going to IPO soon.
Mia tweet media
Mia@MiaAI_lab

The publicity @AnthropicAI got from the Fable 5 drama is going to create even more demand for it. There is no such thing as bad publicity if your product is good.

0
0
0
23
Mia
Mia@MiaAI_lab·
@RaulWesche Good to see. It's a beast, and my go-to currently. Enjoy
0
0
1
15
Raul Wesche
Raul Wesche@RaulWesche·
@MiaAI_lab I’ve been using your ds4-flash you posted a few days ago for dual sparks and it’s awesome
1
0
1
21
Mia
Mia@MiaAI_lab·
I find the default DS4-Flash temperature gives it too much creativity for coding. It sometimes does things I didn't ask for. Going to run it with temp 0.6 and top_p 0.95 for a while and compare.
0
0
7
289
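For readers unfamiliar with the two knobs mentioned in the post above, here is a minimal pure-Python sketch of how temperature and top_p typically shape next-token sampling. It is an illustration under assumptions, not DS4-Flash's actual sampler: the function names and the toy logits are hypothetical, and real inference engines apply these steps to tensors over a full vocabulary.

```python
import math
import random


def filtered_distribution(logits, temperature=0.6, top_p=0.95):
    """Return {token_index: probability} after temperature scaling and top-p filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Dividing by a temperature below 1 sharpens the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Keep the most likely tokens until their cumulative mass reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize so the surviving tokens sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=0.6, top_p=0.95, rng=random):
    """Draw one token index from the filtered distribution."""
    dist = filtered_distribution(logits, temperature, top_p)
    indices = list(dist)
    return rng.choices(indices, weights=[dist[i] for i in indices], k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0]  # hypothetical values for illustration only
    print(filtered_distribution(toy_logits, temperature=1.0, top_p=1.0))
    print(filtered_distribution(toy_logits, temperature=0.6, top_p=0.95))
```

On the toy logits, the lower temperature puts more weight on the top token and the top_p cutoff drops the least likely one, which is the "less creative" behavior the post is after.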
kache
kache@yacineMTB·
I wonder how much money anthropic is losing every day they don't have fable available. Probably a lot. I'm beginning to actually feel sorry for them..
126
8
727
43.4K
Mia
Mia@MiaAI_lab·
The publicity @AnthropicAI got from the Fable 5 drama is going to create even more demand for it. There is no such thing as bad publicity if your product is good.
0
0
0
142
mr-r0b0t
mr-r0b0t@mr_r0b0t·
2x will cover many, many use cases! The third one should help me train larger models, something that could easily be done more quickly by renting a B200 cloud instance. Given I still have much to learn, renting GPUs could/would become expensive very quickly, and any learning (read: mistakes) would be quite costly!
1
0
0
14
RobbiewOnline
RobbiewOnline@RobbiewOnline·
@MiaAI_lab RepoPrompt looks like a smart way to optimize token usage when working with LLMs, something we dive deep into in the book. The cost of blindly pasting entire files adds up fast (often £1-4 per session), so selective file inclusion is key. Your XML output approach could pair well with local model routing strategies covered in Chapter 2. amazon.com/dp/B0GYDV3FXD If this was helpful, I’d appreciate a repost and a follow. I’ll be sharing more insights from the book, plus what I’ve learned from applying them in the real world, over the coming weeks.
RobbiewOnline tweet media (4 images)
1
0
0
8
Eric Hartford
Eric Hartford@QuixiAI·
@MiaAI_lab @deepseek_ai DeepSeek v4 Flash is text-only, 284B. @StepFun_ai Step 3.7 Flash is a text + vision model, 198B. The vision and the smaller size are more appealing. I choose Step 3.7 Flash.
2
2
10
1.1K
Mia
Mia@MiaAI_lab·
@anuntrapid_auto @0xSero Yeah, 4 is the way. Not going to commit to 3 Sparks if I'm not planning to get the 4th one.
0
0
1
6
0xSero
0xSero@0xSero·
MiniMax-M3-NVFP4 running on 4x RTX PRO 6000. Repo coming soon.
0xSero tweet media
14
5
157
7.7K
Theo - t3.gg
Theo - t3.gg@theo·
@WIRED Why are you posting 4 day old articles that are no longer accurate lol
43
7
2K
29.5K
WIRED
WIRED@WIRED·
Anthropic is releasing Claude Mythos 5 to trusted organizations and Claude Fable 5 to the public, a version it says can’t be used for cyberattacks. wired.com/story/anthropi…
67
26
294
85.1K
Mia
Mia@MiaAI_lab·
@mr_r0b0t @Tech2Wild @NVIDIAAI 3 can't be used for TP though... you want it for more concurrent sessions and more kv cache?
0
0
1
21
mr-r0b0t
mr-r0b0t@mr_r0b0t·
@Tech2Wild @NVIDIAAI 3 is a big unlock tho ngl! Bet you're already thinking about 4 and a switch tho 😛🤩 Imma try to suppress those urges for a bit
4
0
1
154
Mia
Mia@MiaAI_lab·
@joaosump @0xSero Yes, TP works in pairs: 2, 4, etc. 3 units would be good for more concurrency and KV cache.
0
0
0
10
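One likely reason behind the "pairs" rule of thumb in this thread: tensor parallelism usually splits attention heads evenly across devices, so the tensor-parallel size has to divide the head count. The sketch below checks only that one constraint. The function name is hypothetical, 64 heads is a made-up example rather than a figure for any model mentioned here, and real engines may enforce further constraints.

```python
def valid_tp_sizes(num_attention_heads, max_devices):
    """List tensor-parallel sizes up to max_devices that split the heads evenly."""
    if num_attention_heads < 1 or max_devices < 1:
        raise ValueError("both arguments must be positive integers")
    return [tp for tp in range(1, max_devices + 1) if num_attention_heads % tp == 0]


if __name__ == "__main__":
    # 64 heads is a made-up example value, not a figure from any specific model.
    print(valid_tp_sizes(64, 4))  # prints [1, 2, 4]
```

With a power-of-two head count, 3 devices never divide evenly, which matches the advice to use a third unit for extra concurrent sessions and KV cache rather than for TP.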
Vieirowski
Vieirowski@joaosump·
@MiaAI_lab @0xSero Damn, that's nice to know, have been considering 2x and thought that was the ceiling. 3x would be an odd number though for vllm right? :(
1
0
0
34