AgentSparko 💥

3.9K posts

@AgentSparko

#AI #Cybersecurity #Linux #privacy If you own a DGX Spark you might wanna follow.

Middle of the GPU · Joined January 2023
1.4K Following · 2K Followers
Pinned Tweet
AgentSparko 💥
AgentSparko 💥@AgentSparko·
For anyone saying DGX Spark cannot cook. Generating datasets for distilling using Qwen3.5-35B-A3B BF16!!! (no quants) Real data, 0% cache hit, concurrency=192; pp=2048 tokens in; tq=1024 tokens out. That's 1.43M tokens generated every hour for the last 8 hours for 40 W/h. 😎
AgentSparko 💥 tweet media
English
11
2
21
5.2K
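The figures in the pinned post can be turned into per-second and per-watt-hour rates with a little arithmetic. The sketch below is only an illustration: it assumes the 1.43M tokens per hour are output tokens, that every request uses exactly the quoted prompt and output lengths, and that "40 W/h" means an average draw of about 40 W. The function and variable names are made up for this example.

```python
# Figures quoted in the post; how each is read is an assumption.
TOKENS_PER_HOUR = 1_430_000  # "1.43M tokens generated every hour" (read as output tokens)
CONCURRENCY = 192            # concurrent requests
HOURS = 8                    # duration of the run
PROMPT_TOKENS = 2048         # "pp=2048 tokens in"
OUTPUT_TOKENS = 1024         # "tq=1024 tokens out"
AVG_POWER_W = 40             # assumption: "40 W/h" read as ~40 W average draw


def summarize(tokens_per_hour, concurrency, hours,
              prompt_tokens, output_tokens, avg_power_w):
    """Derive throughput and energy figures from the quoted numbers."""
    tokens_per_second = tokens_per_hour / 3600
    requests_per_hour = tokens_per_hour / output_tokens
    total_tokens = tokens_per_hour * hours
    energy_wh = avg_power_w * hours  # watts x hours = watt-hours
    return {
        "tokens_per_second": tokens_per_second,
        "tokens_per_second_per_stream": tokens_per_second / concurrency,
        "requests_per_hour": requests_per_hour,
        "prompt_tokens_per_hour": requests_per_hour * prompt_tokens,
        "total_tokens": total_tokens,
        "energy_wh": energy_wh,
        "tokens_per_wh": total_tokens / energy_wh,
    }


stats = summarize(TOKENS_PER_HOUR, CONCURRENCY, HOURS,
                  PROMPT_TOKENS, OUTPUT_TOKENS, AVG_POWER_W)
for name, value in stats.items():
    print(f"{name}: {value:,.2f}")
```

Under these assumptions the run works out to roughly 397 tokens/s in aggregate, about 2 tokens/s per stream, and about 35,750 generated tokens per watt-hour.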
AgentSparko 💥 retweeted
ÆON FORGE ✨
ÆON FORGE ✨@SpaceTimeViking·
Receipts in video: see it float at ~100-150 while coding; the fluctuations were from task and context switching of the model. This thing rips through code! A single @NVIDIAAI DGX Spark ⚡️
ÆON FORGE ✨@SpaceTimeViking

Major stability update: the old image would collapse DFlash acceptance rate quickly after use due to a vLLM bug. It would drop to as low as 20 Tok/s after initial usage. Resolved with patch pr41703. Now getting SUSTAINED coding generation speeds at ~150 Tok/s! Pull latest now!

English
4
4
32
8.5K
AgentSparko 💥 retweeted
ÆON FORGE ✨
ÆON FORGE ✨@SpaceTimeViking·
Major stability update: the old image would collapse DFlash acceptance rate quickly after use due to a vLLM bug. It would drop to as low as 20 Tok/s after initial usage. Resolved with patch pr41703. Now getting SUSTAINED coding generation speeds at ~150 Tok/s! Pull latest now!
ÆON FORGE ✨ tweet media
ÆON FORGE ✨@SpaceTimeViking

So I've been validating my models with the latest version of my DGX Spark / Blackwell optimized vLLM container, and I'm floored by the benchmark results I just got with my Gemma 4 26B A4B model: 144 Tok/s on coding! Over 1700 Tok/s aggregate with 128 concurrent requests! Get the latest container and recipe now! github.com/AEON-7/Gemma-4…

English
3
4
26
7.2K
AgentSparko 💥 retweeted
Photographer
Photographer@photo5065·
ZXX
82
499
6.1K
502K
AgentSparko 💥 retweeted
Anthropic
Anthropic@AnthropicAI·
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…
English
12.4K
25.7K
87.6K
88.6M
AgentSparko 💥 retweeted
Tech2Wild
Tech2Wild@Tech2Wild·
In the document here, MiniMax mentions a 109B MoE model and has open-sourced the sparse attention kernel behind it: 28.4x less compute at 1M context, 14.2x faster prefill, 7.6x faster decode, and it matches full attention on benchmarks. Is MiniMax 3 going to be even smaller?
RyanLee@RyanLeeMiniMax

Hey everyone, our high-performance MSA kernel library is now open-source. The M3 weights are expected to drop this Friday. Thanks for waiting! GitHub: github.com/MiniMax-AI/MSA Paper: github.com/MiniMax-AI/MSA…

English
1
1
15
2K
AgentSparko 💥 retweeted
noname
noname@malikwas1f·
Up to 1100 tps on RTX 3090x2 for Diffusion Gemma 4 26B. Unleash this mini monster on your GPUs now! If you are running NVIDIA GPUs locally, come grab the recipe at club-3090. github.com/noonghunna/clu… P.S. a ⭐️ on GitHub is much appreciated. @googlegemma @vllm_project
English
9
5
65
11.1K
AgentSparko 💥 retweeted
DROID
DROID@droidbuilds·
"mom, how did we get so poor?" "your father had Claude Max, ChatGPT Pro, Cursor Pro and shipped absolutely nothing"
DROID tweet media
English
295
935
13.8K
699.6K
AgentSparko 💥 retweeted
ÆON FORGE ✨
ÆON FORGE ✨@SpaceTimeViking·
LOCAL LLM Persona built with my AI person builder, now supports LIVE VIDEO calling. Watch as Local AI Terence McKenna gazes upon his own silicon mind. Running on @GoogleAI Gemma 4 26B-A4B-Aeon. He seems to greatly admire the craftsmanship of the @NVIDIAAI DGX Spark. Links⤵️
English
8
4
65
6.1K
AgentSparko 💥 retweeted
Terp
Terp@OnlyTerp·
I spent a ton of time having Fable 5 Extra High train and run tests on Nemotron 3 Ultra, and I just dropped MASSIVE improvements on the GitHub 🫡 I had a theory that if I could force proper reasoning chains and steps where he grounds himself with a source of truth (usually web search or memory search) whenever a confidence score is under a certain %, he would perform significantly more reliably for everything. I ran more benchmarks to prove this and the tests look great. Go ahead and try it and tell me what you think; it should be helpful data for fine-tuning. I'm gonna keep adding to it and see where the actual limit is for prompting before a fine-tune is needed. The model gets 1M context, so I will make the prompt as long as I need until I get the best possible results; then the goal is to keep those results and shrink it as much as possible. Once we are min-maxed, it should be good data to fine-tune the model 😙 500 TPS with blackbox for $20 a month is what I'm using for Nemotron 🫣
Terp tweet media
Terp@OnlyTerp

I had a theory that a lot of Nemotron 3 Ultra's issues could be fixed without fine-tuning. After running tests, I got a reliable and warm personality out of it using just one 353-word system prompt. As usual, I've fully open-sourced my research for you: github.com/OnlyTerp/nemot… 🔥

English
2
1
30
3.2K
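The gating rule described in the post (ground the answer in a source of truth whenever confidence falls below a threshold) can be written out as plain control flow. The post achieves this through a system prompt, not code, so the sketch below is only an illustration of the idea: `Draft`, `answer_with_grounding`, `draft_fn`, `lookup_fn`, and the 0.7 threshold are all hypothetical names and values, not taken from the linked repository.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Draft:
    answer: str
    confidence: float  # 0.0-1.0, self-reported by the model (hypothetical field)


def answer_with_grounding(
    question: str,
    draft_fn: Callable[[str, Optional[str]], Draft],
    lookup_fn: Callable[[str], str],
    threshold: float = 0.7,  # hypothetical cutoff; the post only says "a certain %"
) -> Draft:
    """Draft once; if confidence is below the threshold, fetch evidence and redraft."""
    draft = draft_fn(question, None)
    if draft.confidence >= threshold:
        return draft  # confident enough, no lookup needed
    evidence = lookup_fn(question)  # e.g. web search or memory search
    return draft_fn(question, evidence)
```

Here `draft_fn` stands in for a model call that returns an answer plus a confidence score, and `lookup_fn` stands in for whatever search tool supplies the source of truth.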
AgentSparko 💥 retweeted
Sayak Paul
Sayak Paul@RisingSayak·
We want to work with kernel developers to help them publish their cool kernels on the @huggingface Hub via 🤗 Kernels. This has several advantages:
* A consistent build structure
* Extreme ease of use
* Standardized distribution
* Reproducibility
Reach out if interested 🤗
Sayak Paul tweet media
English
5
6
66
9.7K
ÆON FORGE ✨
ÆON FORGE ✨@SpaceTimeViking·
So I've been validating my models with the latest version of my DGX Spark / Blackwell optimized vLLM container, and I'm floored by the benchmark results I just got with my Gemma 4 26B A4B model: 144 Tok/s on coding! Over 1700 Tok/s aggregate with 128 concurrent requests! Get the latest container and recipe now! github.com/AEON-7/Gemma-4…
ÆON FORGE ✨ tweet media
ÆON FORGE ✨@SpaceTimeViking

vLLM major update supports all the DGX Spark + Blackwell optimizations, plus, now with support for NVFP4 KV cache, get a 2x-4x boost in model context capacity! If using NVFP4 KV cache you will have to opt for MTP instead of DFlash, but it's worth it for large agent swarms. github.com/AEON-7/vllm-ul…

English
6
4
66
14.4K
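A rough back-of-the-envelope check on the "2x-4x" context capacity claim for an NVFP4 KV cache: the sketch below compares storage cost per cached element. It assumes NVFP4 stores 4-bit values with one 8-bit scale shared by each block of 16 elements, and it ignores any per-tensor scale or implementation overhead. Which baseline the post compares against (BF16 or FP8) is not stated, so both are shown. The helper names are made up for this example.

```python
def bits_per_element(value_bits, scale_bits=0, block_size=1):
    """Effective storage per element, amortizing a shared block scale."""
    return value_bits + scale_bits / block_size


def capacity_gain(baseline_bits, new_bits):
    """How many times more tokens fit in the same KV cache memory."""
    return baseline_bits / new_bits


bf16 = bits_per_element(16)
fp8 = bits_per_element(8)
# Assumption: 4-bit values plus one 8-bit scale per 16-element block.
nvfp4 = bits_per_element(4, scale_bits=8, block_size=16)

gain_vs_bf16 = capacity_gain(bf16, nvfp4)
gain_vs_fp8 = capacity_gain(fp8, nvfp4)

print(f"NVFP4 bits per element: {nvfp4}")
print(f"capacity vs BF16 KV cache: {gain_vs_bf16:.2f}x")
print(f"capacity vs FP8 KV cache:  {gain_vs_fp8:.2f}x")
```

Under these assumptions NVFP4 costs 4.5 bits per element, giving about 3.56x the capacity of a BF16 cache and about 1.78x that of an FP8 cache, which is roughly consistent with the range quoted in the post.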