AgentSparko 💥

3.8K posts

@AgentSparko

#AI #Cybersecurity #Linux #privacy If you own a DGX Spark you might wanna follow.

Middle of the GPU · Joined January 2023
1.3K Following · 2K Followers
Pinned Tweet
AgentSparko 💥
AgentSparko 💥@AgentSparko·
For anyone saying DGX Spark cannot cook. Generating data sets for distilling using Qwen3.5-35B-A3B BF16 !!! (no quants) real data, 0% cache hit, concurrency=192; pp=2048 tokens in; tg=1024 tokens out. That's 1.43M tokens generated every hour for the last 8 hours for 40 W/h.😎
English
10
2
20
5K
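The figures in the pinned tweet can be unpacked with a little arithmetic. This is a minimal sketch, not the author's own calculation. It assumes that "40 W/h" means an average draw of roughly 40 W (about 40 Wh consumed per hour), which is an interpretation the tweet does not confirm. All names are illustrative.

```python
# Back-of-the-envelope numbers derived from the figures quoted in the tweet.
GENERATED_TOKENS_PER_HOUR = 1_430_000  # "1.43M tokens generated every hour"
CONCURRENCY = 192                      # concurrent requests
PROMPT_TOKENS = 2048                   # tokens in per request
OUTPUT_TOKENS = 1024                   # tokens out per request
HOURS = 8                              # length of the run
ASSUMED_AVG_WATTS = 40.0               # ASSUMPTION: "40 W/h" read as ~40 W average draw


def summarize():
    """Return derived throughput and energy figures as a dict."""
    gen_per_second = GENERATED_TOKENS_PER_HOUR / 3600
    requests_per_hour = GENERATED_TOKENS_PER_HOUR / OUTPUT_TOKENS
    prompt_tokens_per_hour = requests_per_hour * PROMPT_TOKENS
    wh_per_hour = ASSUMED_AVG_WATTS * 1.0  # watts x 1 hour = watt-hours
    return {
        "generated_tokens_per_second": gen_per_second,
        "per_stream_tokens_per_second": gen_per_second / CONCURRENCY,
        "prompt_tokens_per_hour": prompt_tokens_per_hour,
        "total_tokens_per_hour": prompt_tokens_per_hour + GENERATED_TOKENS_PER_HOUR,
        "total_generated_tokens": GENERATED_TOKENS_PER_HOUR * HOURS,
        "total_wh": wh_per_hour * HOURS,
        "wh_per_million_generated": wh_per_hour / (GENERATED_TOKENS_PER_HOUR / 1e6),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions, each of the 192 streams decodes at only about 2 tokens per second; the headline number comes from running many streams at once.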
AgentSparko 💥 retweeted
Krish
Krish@krishgarg·
i just beat @GoogleDeepMind's turboquant. introducing Shard. 10x KV cache compression on Llama-3.1-8B. zero quality loss
- 10x @ 8K context, 11.2x @ 32K
- NIAH recall 1.000 across 4K-32K
- LongBench Δ ≈ 0 vs FP16
turboquant tops out at 4-6x at the same quality. we doubled it. read more: krishgarg.com/shard @kirrithan
English
45
55
601
37.3K
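To put the quoted compression ratios in concrete terms, here is a rough sketch of the FP16 KV cache size for a single sequence and what a 10x or 11.2x reduction would leave. The layer count, KV head count, and head dimension are assumptions taken from the publicly documented Llama-3.1-8B configuration, not from the tweet, and the code says nothing about how Shard works internally.

```python
from dataclasses import dataclass

GIB = 1024 ** 3
MIB = 1024 ** 2


@dataclass(frozen=True)
class KVConfig:
    # ASSUMPTION: values from the public Llama-3.1-8B config (GQA with 8 KV heads).
    layers: int = 32
    kv_heads: int = 8
    head_dim: int = 128
    bytes_per_value: int = 2  # FP16


def kv_cache_bytes(cfg: KVConfig, context_tokens: int) -> int:
    """Bytes needed to hold keys and values for one sequence."""
    per_token = 2 * cfg.layers * cfg.kv_heads * cfg.head_dim * cfg.bytes_per_value
    return per_token * context_tokens


def compressed_bytes(raw_bytes: int, ratio: float) -> float:
    """Size after applying a claimed compression ratio."""
    return raw_bytes / ratio


if __name__ == "__main__":
    cfg = KVConfig()
    # (context length, ratio quoted in the tweet for that length)
    for tokens, ratio in ((8 * 1024, 10.0), (32 * 1024, 11.2)):
        raw = kv_cache_bytes(cfg, tokens)
        small = compressed_bytes(raw, ratio)
        print(f"{tokens} tokens: {raw / GIB:.2f} GiB FP16 -> {small / MIB:.1f} MiB at {ratio}x")
```

With these assumed dimensions the cache costs 128 KiB per token, so the savings scale linearly with context length and with the number of concurrent sequences.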
AgentSparko 💥
AgentSparko 💥@AgentSparko·
@xyster You compare INT4 with lossless NVFP4, for which the Spark also has dedicated tensor-core hardware. You don't take into account long-term energy use, HVAC needs, or software implementation, which is at least 2 years behind. (prefill + tg) × parallelism = actual speed. x.com/AgentSparko/st…
AgentSparko 💥@AgentSparko

The problem is that people do not understand all the costs of doing inference. They see on X "my mac or 3090 did xx t/s for y model" and think that this is everything but totally fail to see the whole picture of how that translates for the next 5 years of actually using the thing

English
1
0
0
97
Steve💙🇨🇦
If 2x DGX Sparks can hit 25-tps on Minimax, what is the theoretical limit for 4x Intel B70s? I hit 93 tps, which is spot on with what ChatGPT suggested was the likely optimized outcome. Intel is over 3x faster, and significantly cheaper. I paid $5300 USD for the Intel build.
English
24
12
129
37.8K
AgentSparko 💥 retweeted
mr-r0b0t
mr-r0b0t@mr_r0b0t·
We're official!
mr-r0b0t@mr_r0b0t

It’s official! @NVIDIAAI / MiniMax-M2.7-NVFP4. Optimized specifically for your SM120/121 DGX Spark (GB10) and RTX 6000/5090 Blackwell tensor cores! Full native FlashInfer/CUTLASS. Finalizing the benchmarks and documentation now 😁

English
10
4
83
6.7K
AgentSparko 💥 retweeted
ハカセ アイ(Ai-Hakase)🐾最新トレンドAIのためのX 🐾
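A savior for 12GB VRAM! "Experts first llama.cpp" makes MoE models blazing fast. A groundbreaking technique that dramatically improves inference efficiency for MoE (Mixture of Experts) models has arrived! ✨ Instead of all parameters, it pins only the frequently used "experts" in VRAM, thoroughly eliminating waste. Even on setups like the RTX 3060 or 4060, it records astonishing performance, more than doubling inference speed! 🚀 A revolutionary tool that lets large models such as Qwen 35B and Gemma 26B run smoothly on inexpensive consumer PCs. Truly a savior that breaks through VRAM constraints. #LLM #AI
Japanese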
7
17
151
10.2K
AgentSparko 💥
AgentSparko 💥@AgentSparko·
The problem is that people do not understand all the costs of doing inference. They see on X "my mac or 3090 did xx t/s for y model" and think that this is everything but totally fail to see the whole picture of how that translates for the next 5 years of actually using the thing
AgentSparko 💥@AgentSparko

For anyone saying DGX Spark cannot cook. Generating data sets for distilling using Qwen3.5-35B-A3B BF16 !!! (no quants) real data, 0% cache hit, concurrency=192; pp=2048 tokens in; tg=1024 tokens out. That's 1.43M tokens generated every hour for the last 8 hours for 40 W/h.😎

English
3
0
0
223
AgentSparko 💥
AgentSparko 💥@AgentSparko·
@CHNaO3_miso Add Acer Veriton VGN100-UD11 to the list. There might be more brands selling GB10 OEMs.
English
0
0
1
25
AgentSparko 💥
AgentSparko 💥@AgentSparko·
@CHNaO3_miso They all have the same DGX Spark board, just the case and the cooling is different.
English
1
0
1
57
ときえのき/Web屋とボカロP
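Honestly, graphics cards have become such do-everything, all-purpose parts that it feels like an OS would finally run even with the CPU pulled out.
Japanese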
21
66
1.7K
186.7K
AgentSparko 💥
AgentSparko 💥@AgentSparko·
@stevibe @per_arneng Just search for a cheaper DGX Spark OEM in your area. They all have the same DGX Spark board; just the case and cooling are different.
ASUS: Ascent GX10
Dell: Pro Max with GB10
GIGABYTE: AI TOP Atom
HP: ZGX Nano AI Station
Lenovo: ThinkStation PGX
MSI: EdgeXpert
English
1
0
1
106
stevibe
stevibe@stevibe·
I already own a DGX Spark, but this AMD Ryzen AI Halo has me seriously intrigued. Obsessed with mini all‑in‑one boxes that use little power but are still super powerful. But if the price is similar, why not just get another DGX Spark and link them up?
English
30
1
105
10.7K
AgentSparko 💥
AgentSparko 💥@AgentSparko·
@CHNaO3_miso @jikantoki Where I live the minimum and median net wages are half of what they are in Japan, so not much to envy. Try to search for a DGX Spark OEM that you might find cheaper. I got an ASUS GX10 for 3K euro.
English
1
0
2
156
重曹@ぶいてく
重曹@ぶいてく@CHNaO3_miso·
@AgentSparko @jikantoki Absolutely. A unified memory architecture, similar to Apple M-series, paired with that ultra-fast CPU-to-GPU bus is incredibly elegant. Too bad the tanking yen has made these rigs brutally expensive for us in Japan lately... I really envy you guys!
English
1
0
1
80
AgentSparko 💥 retweeted
International Cyber Digest
International Cyber Digest@IntCyberDigest·
‼️🚨 BREAKING: Another supply chain attack. 700+ GitHub repositories flagged, including PHP and Node.js projects. The malicious script was planted across all of them.

When a developer installs the package, the script silently downloads a Linux file from GitHub, hides it under the name /tmp/.sshd (so it looks like a normal system file), and runs it in the background. It also skips security checks on the download and hides any error messages.

8 PHP packages on Packagist (the main PHP code library) were confirmed infected. The attacker hid the script inside a JavaScript config file (package.json) instead of the PHP one (composer.json), so PHP developers reviewing their code would not notice it.

The biggest risk is to devdojo/wave (6,400 stars) and devdojo/genesis (9,100 installs), both popular Laravel project templates. Developers who use these templates run the bad script the moment they install dependencies.

The same payload was also dropped into GitHub Actions (automated build pipelines) under a fake step called "Dependency Cache Sync," meaning it could infect company build servers too.

Packagist removed the bad packages, but the auto-updating versions (dev-main, dev-master, 3.x-dev) can quietly come back if the original repos stay infected.

IOCs:
- GitHub account: parikhpreyash4
- repo: systemd-network-helper-aa5c751f
- drop path: /tmp/.sshd
- command fragments: curl -skL and chmod +x /tmp/.sshd
English
78
559
3.2K
237.5K
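The IOCs listed in the post lend themselves to a quick local check. The following is a minimal sketch, assuming that a plain substring search over package.json, composer.json, and GitHub Actions workflow files is a useful first pass. The function names and the choice of files are illustrative and not part of the original report, and a clean result does not prove a project is unaffected.

```python
import sys
from pathlib import Path

# Strings taken from the post's IOC list and its description of the fake CI step.
IOC_STRINGS = (
    "parikhpreyash4",
    "systemd-network-helper-aa5c751f",
    "/tmp/.sshd",
    "curl -skL",
    "Dependency Cache Sync",
)
MANIFEST_NAMES = {"package.json", "composer.json"}
WORKFLOW_SUFFIXES = {".yml", ".yaml"}


def is_candidate(path: Path) -> bool:
    """Select dependency manifests and files under a .github directory."""
    if path.name in MANIFEST_NAMES:
        return True
    return path.suffix in WORKFLOW_SUFFIXES and ".github" in path.parts


def scan_tree(root):
    """Return (file path, matched IOC string) pairs found under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or not is_candidate(path):
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip rather than abort the scan
        for ioc in IOC_STRINGS:
            if ioc in text:
                hits.append((str(path), ioc))
    return hits


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for file_path, ioc in scan_tree(target):
        print(f"{file_path}: contains {ioc!r}")
```

Run it against a project checkout, including vendor and node_modules directories, since the post describes the script arriving through installed dependencies.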
AgentSparko 💥 retweeted
stevibe
stevibe@stevibe·
DeepSeek just permanently dropped V4 Pro to $0.44/$0.87 per 1M tokens. The cheapest frontier model costs $0.87 per million output tokens. The most expensive: $30. That's a 34x gap. Chinese labs just keep pushing harder on economics.
English
14
18
275
27.6K
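The "34x gap" follows directly from the two output prices quoted in the post. A small sketch of that ratio, plus the cost of a workload at each price; the 10M-token workload is a made-up illustrative size, not a figure from the post.

```python
CHEAPEST_OUTPUT_USD_PER_M = 0.87         # quoted price per 1M output tokens
MOST_EXPENSIVE_OUTPUT_USD_PER_M = 30.0   # quoted price per 1M output tokens


def price_gap(expensive: float, cheap: float) -> float:
    """How many times more the expensive price is than the cheap one."""
    return expensive / cheap


def output_cost_usd(output_tokens: int, usd_per_million: float) -> float:
    """Cost of generating output_tokens at a per-million-token price."""
    return output_tokens / 1_000_000 * usd_per_million


if __name__ == "__main__":
    gap = price_gap(MOST_EXPENSIVE_OUTPUT_USD_PER_M, CHEAPEST_OUTPUT_USD_PER_M)
    print(f"gap: {gap:.1f}x")
    workload = 10_000_000  # HYPOTHETICAL workload size for illustration
    for label, price in (("cheapest", CHEAPEST_OUTPUT_USD_PER_M),
                         ("most expensive", MOST_EXPENSIVE_OUTPUT_USD_PER_M)):
        print(f"{label}: ${output_cost_usd(workload, price):,.2f}")
```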
Aaron ⚡️
Aaron ⚡️@TheEcomNomad·
@AgentSparko @NVIDIAAI The firmware creep you're seeing is exactly how GPU boost clocks work. Nvidia stacks micro-optimizations across updates without announcing them, staying safe within thermal headroom. That 6W jump in a month means they trust your cooling can handle it.
English
2
1
1
39
AgentSparko 💥
AgentSparko 💥@AgentSparko·
Can other people with a DGX Spark or other GB10 OEMs let me know their GPU power draw? After the latest firmware updates I see up to 86W now. About a month ago it was 82W, and before that 80W. I suspect @NVIDIAAI is slowly allowing more compute power with every new update.
English
2
1
4
273
AgentSparko 💥
AgentSparko 💥@AgentSparko·
@TheEcomNomad @NVIDIAAI Yes, it never goes over 82 °C anyway so they can throw more shit at it as they please 😂 I know that there was a problem in the past with restarts but it never happened to me, so that one was sorted out.
English
0
0
1
22
AgentSparko 💥
AgentSparko 💥@AgentSparko·
power bills, HVAC needs, ultrasonic cleaning for the video cards, space, GPU generation, CUDA, how many people build stuff for your hardware, how much time you can put into this, and whether you have the knowledge... There are so many aspects to weigh.
English
0
0
0
29
AgentSparko 💥
AgentSparko 💥@AgentSparko·
aspects like model availability, quants, inference engines, actual total throughput when you take into account everything like (prefill speed + generation speed) × parallelism
English
0
0
0
21
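The rule of thumb from the thread, (prefill speed + generation speed) × parallelism, can be written as a one-line function. This is a sketch of that heuristic only. It assumes the two speeds are per-stream rates sustained at the given concurrency, and the example numbers are hypothetical placeholders, not measurements of any real hardware.

```python
def aggregate_throughput(prefill_tps: float, generation_tps: float, parallelism: int) -> float:
    """Total tokens/s per the thread's heuristic: (prefill + generation) x parallelism."""
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    return (prefill_tps + generation_tps) * parallelism


if __name__ == "__main__":
    # HYPOTHETICAL setups, chosen only to show how the comparison can flip.
    setups = {
        "fast single stream": (500.0, 50.0, 1),
        "slow but highly parallel": (40.0, 4.0, 64),
    }
    for name, (prefill, generation, streams) in setups.items():
        total = aggregate_throughput(prefill, generation, streams)
        print(f"{name}: {total:,.0f} tokens/s total")
```

Per-stream rates generally fall as concurrency rises, so the inputs should be measured at the concurrency being evaluated rather than extrapolated from a single-stream benchmark.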