I have a DGX Spark. It's an amazing device. It gets a lot of criticism from people who haven't used it and don't know any better.
Nvidia isn't stupid. They wouldn't double down on the Spark if it weren't good.
Yes, memory bandwidth is low, but the processing power is amazing. Prefill matters more than decode. MTP (multi-token prediction) and other software enhancements improve decode speeds too.
NVFP4 is really important, but rarely mentioned.
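To put rough numbers on the bandwidth point, here's a back-of-envelope sketch. It assumes single-stream decode is limited purely by reading every active weight once per token, and it ignores KV cache traffic, compute, and tricks like MTP. The function name and all figures are made up for illustration. They are not Spark measurements.

```python
def decode_tokens_per_sec(bandwidth_gb_s: float,
                          active_params_billion: float,
                          bytes_per_param: float) -> float:
    """Rough upper bound on single-stream decode speed.

    Assumption: each generated token requires reading every active
    weight once, so speed = memory bandwidth / bytes read per token.
    """
    # billions of params * bytes per param = GB read per token
    gb_per_token = active_params_billion * bytes_per_param
    return bandwidth_gb_s / gb_per_token


# Hypothetical model with 10B active parameters on a 250 GB/s device.
at_16_bit = decode_tokens_per_sec(250.0, 10.0, 2.0)  # 16-bit weights
at_4_bit = decode_tokens_per_sec(250.0, 10.0, 0.5)   # ~4-bit weights

print(f"16-bit: {at_16_bit:.1f} tok/s, 4-bit: {at_4_bit:.1f} tok/s")
```

Under these assumptions, going from 16-bit to roughly 4-bit weights cuts the bytes read per token by 4x, so the decode ceiling goes up by 4x on the same bandwidth. That's the reason a 4-bit format matters so much on a bandwidth-limited device.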
I run models that are every bit as good as Sonnet 4.6 and early Opus versions, at over 100 tokens per second decode on the Spark. That's with a 256k context window and multiple streams (multiple agents).
Don't listen to the haters. They literally do not know what they are talking about and have not used the device. Trust me. This was a hard decision to make. I was hung up on the memory bandwidth thing myself. I have an RTX 3090 that has great memory bandwidth, and I still use the Spark more.
However, I do have to mention that it isn't always easy to find the right models and run them. There's a learning curve. Sometimes getting the right vLLM config options is a challenge, or something doesn't work well for tool calling, etc. This has nothing to do with the Spark. It's just the current experience of running local AI.
So Nvidia really needs to make sure the out-of-box experience is better when they release those laptops. Most people don't want to learn about all the config options and spend a lot of time setting things up.
It's a great piece of hardware that's incredibly underrated and often misunderstood. I would currently put it in the research camp though. Without a better out-of-box experience, you might not like it. If you're into learning and tinkering, it's great.