
i've run a stack of models across a single 3090, a 5090, and a 128GB DGX Spark. exactly three are worth building on. this is the honest list.

the three worth it:

> 1. StepFun Step-3.5 Flash, the REAP pruned 121B MoE (Q6, DGX Spark)

a 121-billion-parameter mixture of experts running on a single desktop box. the most worth-it model in everything i've tested.

> 2. Qwen 3.6 27B Dense, Q4 (single RTX 3090)

the undisputed king of the 24GB tier. it one-shot a playable game, runs at around 41 tok/s, and fits with context headroom to spare. if you have one 24GB card, this is your answer.

> 3. NVIDIA Nemotron 3 Nano Omni, 30B-A3B (DGX Spark)

the best multimodal model i've tested for video classification work. vision in, runs clean on the Spark.

the rest: i ran them, and they hold up fine.

on the Spark: DeepSeek V4 Flash 158B, GLM 4.7 Flash, GLM 4.5 Air REAP 82B-A12B, Gemma 4 26B-A4B, Qwen3-VL 235B-A22B, Qwen3 Coder 30B-A3B, Qwen3 30B-A3B, Carnice 35B-A3B.

on consumer GPUs: Kimi K2.5 1T, Qwen3-Coder-Next 80B, Hermes 4.3 36B, Qwen 3.5 27B Dense.

single 3090 to a 128GB Spark, that's the range. the three up top are the ones worth your hardware today.
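a rough way to sanity-check the "fits with headroom" claims is back-of-envelope arithmetic: parameters times bits per weight, divided by 8. this is only a sketch under loud assumptions. it uses the nominal bit widths (4 for Q4, 6 for Q6), so it gives a lower bound on weight size. real quant formats typically carry extra per-block metadata, and the KV cache and runtime overhead come on top. the function names `weight_gb` and `headroom_gb` are made up for illustration and are not from any library.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Lower-bound size of the weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at exactly `bits_per_weight` bits.
    Real quantized files are somewhat larger than this.
    """
    return params_billion * bits_per_weight / 8


def headroom_gb(params_billion: float, bits_per_weight: float, memory_gb: float) -> float:
    """Memory left for KV cache, activations, and runtime after the weights."""
    return memory_gb - weight_gb(params_billion, bits_per_weight)


if __name__ == "__main__":
    # (label, params in billions, nominal bits per weight, device memory in GB)
    setups = [
        ("27B dense, Q4, 24GB card", 27, 4, 24),
        ("121B MoE, Q6, 128GB box", 121, 6, 128),
    ]
    for label, params, bits, mem in setups:
        print(
            f"{label}: weights >= {weight_gb(params, bits):.2f} GB, "
            f"headroom <= {headroom_gb(params, bits, mem):.2f} GB"
        )
```

with nominal bit widths, the 27B Q4 weights come to at least 13.5 GB on a 24GB card and the 121B Q6 weights to at least 90.75 GB on a 128GB box. treat both headroom numbers as upper bounds, not measurements.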

















