AgentSparko 💥
3.9K posts

@AgentSparko
#AI #Cybersecurity #Linux #privacy If you own a DGX Spark you might wanna follow.

ULTRACODE-SHIM IS NOW LIVE 🔥 You can now run ANY model in UltraCode. I built a GitHub repo to make this really easy for you: just send your agent there and let him COOK. You deserve the flexibility to use LOCAL models and cost-efficient models, so I made that happen for you 🫶


Hey everyone, our high-performance MSA kernel library is now open-source. The M3 weights are expected to drop this Friday. Thanks for waiting! GitHub: github.com/MiniMax-AI/MSA Paper: github.com/MiniMax-AI/MSA…


My first reaction: How is that possible? Running DiffusionGemma 26B A4B NVFP4 on my DGX Spark at 161.9 tok/s!

DiffusionGemma, our experimental open model released under an Apache 2.0 license, explores text diffusion, an exceptionally fast approach to text generation. Here’s how DiffusionGemma accelerates development:
+ Faster token output: By shifting the bottleneck from memory bandwidth to raw compute, the model generates up to 4x faster token output on dedicated GPUs
+ Accessible hardware footprint: Activates just 3.8B parameters during inference, fitting comfortably within 24GB-VRAM high-end consumer GPUs when quantized
+ Novel workflows: Parallel token generation enables self-correction, making it ideal for code infilling, in-line editing, and non-linear structures
DiffusionGemma prioritizes speed over raw quality and accelerates best on compute-bound hardware (like @NVIDIAAI GPUs). Standard @GoogleGemma 4 remains recommended for production quality and memory-bound devices.
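A rough back-of-the-envelope check on the "fits within 24GB when quantized" claim, as a sketch only. The 26B total parameter count is taken from the "26B A4B" name in the post above, and the 4.5 bits per weight figure is an assumption (4-bit values plus one 8-bit scale per 16-value block). Activations, KV cache, and runtime overhead are ignored, so real usage will be higher.

```python
def weight_footprint_gb(total_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return total_params * bits_per_param / 8 / 1e9


# Assumed inputs (not confirmed by the post): 26e9 total parameters,
# 4.5 effective bits per weight for a 4-bit block-scaled format.
footprint_gb = weight_footprint_gb(26e9, 4.5)

# Compare against the 24 GB VRAM figure quoted in the post.
fits_in_24gb = footprint_gb < 24

print(f"approx. weight footprint: {footprint_gb:.1f} GB, fits in 24 GB: {fits_in_24gb}")
```

All 26B weights have to be resident even though only about 3.8B are active per token, which is why the total count, not the active count, drives the estimate.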


I had a theory that a lot of Nemotron 3 Ultra's issues could be fixed without fine-tuning. After running tests, I got a reliable & warm personality out of it using just one 353-word system prompt. As usual, I've fully open-sourced my research for you: github.com/OnlyTerp/nemot… 🔥

DeepSeek has started distilling Fable and Mythos, and will soon make them available to everyone at 1% of the price.




The beast is roaring.



Major vLLM update: it supports all the DGX Spark + Blackwell optimizations, and now supports an NVFP4 KV cache for a 2x-4x boost in model context capacity! If you use the NVFP4 KV cache you will have to opt for MTP instead of DFlash, but it's worth it for large agent swarms. github.com/AEON-7/vllm-ul…
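Where a 2x-4x range could come from, as a hedged sketch: KV cache capacity scales inversely with bits stored per element. The numbers below assume NVFP4 stores 4-bit values with one 8-bit scale per 16-element block (about 4.5 bits per element) and compare that with FP16 and FP8 baselines. The helper names are hypothetical and this is not vLLM code.

```python
def bits_per_element(value_bits: float, scale_bits: float = 0.0, block_size: int = 1) -> float:
    """Effective storage per element, amortizing a per-block scale factor."""
    return value_bits + scale_bits / block_size


def capacity_gain(baseline_bits: float, quantized_bits: float) -> float:
    """How many more tokens fit in the same memory after quantizing the cache."""
    return baseline_bits / quantized_bits


# Assumption: 4-bit values plus an 8-bit scale shared by every 16 elements.
nvfp4_bits = bits_per_element(4, scale_bits=8, block_size=16)

gain_vs_fp16 = capacity_gain(16, nvfp4_bits)  # baseline: FP16 KV cache
gain_vs_fp8 = capacity_gain(8, nvfp4_bits)    # baseline: FP8 KV cache

print(f"vs FP16: {gain_vs_fp16:.2f}x, vs FP8: {gain_vs_fp8:.2f}x")
```

Under these assumptions the gain is roughly 1.8x over an FP8 cache and 3.6x over an FP16 cache, which is in the same ballpark as the quoted range.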

A mountain lioness appears to attack this explorer after a spring snowstorm; a story with many twists and turns…






