AgentSparko 💥

3.8K posts

AgentSparko 💥

@AgentSparko

#AI #Cybersecurity #Linux #privacy If you own a DGX Spark you might wanna follow.

Middle of the GPU · Joined January 2023

1.3K Following · 2K Followers

Pinned Tweet

AgentSparko 💥@AgentSparko·31 Mar

For anyone saying DGX Spark cannot cook: generating datasets for distillation using Qwen3.5-35B-A3B BF16 (no quants)! Real data, 0% cache hit, concurrency=192; pp=2048 tokens in; tg=1024 tokens out. That's 1.43M tokens generated every hour for the last 8 hours for 40 W/h. 😎
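A rough sanity check of the numbers in this post, assuming the 1.43M figure counts generated (output) tokens only and that all 192 requests stayed in flight for the whole run. Neither is stated explicitly, and the function names are illustrative rather than taken from any library.

```python
# Back-of-the-envelope check of the throughput figures quoted in the post.
# Assumption: 1.43M tokens/hour refers to generated (output) tokens only.

TOKENS_PER_HOUR = 1.43e6  # from the post
CONCURRENCY = 192         # from the post
HOURS = 8                 # from the post


def aggregate_tps(tokens_per_hour: float) -> float:
    """Tokens per second summed over all parallel streams."""
    return tokens_per_hour / 3600.0


def per_stream_tps(tokens_per_hour: float, concurrency: int) -> float:
    """Average tokens per second seen by a single stream."""
    return aggregate_tps(tokens_per_hour) / concurrency


def total_tokens(tokens_per_hour: float, hours: float) -> float:
    """Tokens produced over the whole run."""
    return tokens_per_hour * hours


if __name__ == "__main__":
    print(f"aggregate:  {aggregate_tps(TOKENS_PER_HOUR):.1f} t/s")
    print(f"per stream: {per_stream_tps(TOKENS_PER_HOUR, CONCURRENCY):.2f} t/s")
    print(f"total:      {total_tokens(TOKENS_PER_HOUR, HOURS) / 1e6:.2f}M tokens")
```

Under those assumptions the run works out to roughly 397 tokens/s in aggregate, or about 2 tokens/s per stream, which shows how much the headline figure depends on high concurrency rather than single-stream speed.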

AgentSparko 💥 retweeted

Krish@krishgarg·6h

i just beat @GoogleDeepMind's turboquant
introducing Shard. 10x KV cache compression on Llama-3.1-8B. zero quality loss
- 10x @ 8K context, 11.2x @ 32K
- NIAH recall 1.000 across 4K-32K
- LongBench Δ ≈ 0 vs FP16
turboquant tops out at 4-6x at the same quality. we doubled it.
read more: krishgarg.com/shard @kirrithan

601

37.3K

AgentSparko 💥@AgentSparko·4h

@LyalinDotCom Ollama is the worst. Spark has hardware tensor cores for NVFP4, and vLLM is the right inference engine, though you have to apply optimizations to it. @SpaceTimeViking's Docker container and model are the best. You will get 16 parallel streams at 10 t/s each with this. github.com/AEON-7/Gemma-4…

Dmitry Lyalin@LyalinDotCom·8h

x.com/i/article/2058…

4.6K

AgentSparko 💥@AgentSparko·1d

@xyster x.com/mr_r0b0t/statu…

mr-r0b0t@mr_r0b0t

16 local AI agents streaming at once! MiniMax M2.7 NVFP4 — 2x GB10, no cloud APIs.

AgentSparko 💥@AgentSparko·1d

@xyster You are comparing INT4 with lossless NVFP4, for which the Spark also has dedicated hardware tensor cores. You don't take into account long-term energy use, HVAC needs, or the software implementation, which is at least 2 years behind. (prefill + tg) × parallelism = actual speed. x.com/AgentSparko/st…

AgentSparko 💥@AgentSparko

The problem is that people do not understand all the costs of doing inference. They see on X "my Mac or 3090 did xx t/s for y model" and think that is everything, but they totally fail to see the whole picture of how that translates over the next 5 years of actually using the thing.

Steve💙🇨🇦@xyster·1d

If 2x DGX Sparks can hit 25-tps on Minimax, what is the theoretical limit for 4x Intel B70s? I hit 93 tps, which is spot on with what ChatGPT suggested was the likely optimized outcome. Intel is over 3x faster, and significantly cheaper. I paid $5300 USD for the Intel build.

129

37.8K

AgentSparko 💥 retweeted

mr-r0b0t@mr_r0b0t·2d

We're official!

mr-r0b0t@mr_r0b0t

It’s official! @NVIDIAAI / MiniMax-M2.7-NVFP4. Optimized specifically for your SM120/121 DGX Spark (GB10) and RTX 6000/5090 Blackwell tensor cores! Full native FlashInfer/CUTLASS. Finalizing the benchmarks and documentation now 😁

6.7K

AgentSparko 💥 retweeted

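A savior for 12GB VRAM! "Experts first llama.cpp" makes MoE models blazing fast. A groundbreaking technique that dramatically improves the efficiency of MoE (Mixture of Experts) model inference has arrived! ✨ By pinning only the frequently used "experts" in VRAM instead of all parameters, it thoroughly eliminates waste. Even on setups such as the RTX 3060 or 4060, it records astonishing performance, with inference speed more than doubling! 🚀 A revolutionary tool that lets large models such as Qwen 35B and Gemma 26B run smoothly on inexpensive consumer PCs. Truly a savior that breaks through VRAM limits. #LLM #AI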

[Image: tweet media from ハカセアイ (Ai-Hakase) 🐾 X for the latest AI trends 🐾]

151

10.2K

AgentSparko 💥@AgentSparko·2d

@NicW_AI What do you have in mind? DM me.

Nic Wienandt@NicW_AI·2d

@AgentSparko Can we get together and talk about cooking?

AgentSparko 💥@AgentSparko·3d

AgentSparko 💥@AgentSparko

223

AgentSparko 💥@AgentSparko·2d

@CHNaO3_miso Add Acer Veriton VGN100-UD11 to the list. There might be more brands selling GB10 OEMs.

AgentSparko 💥@AgentSparko·2d

@CHNaO3_miso They all have the same DGX Spark board; only the case and the cooling are different.

ときえのき/Web屋とボカロP@jikantoki·4d

Honestly, graphics cards have become such do-everything parts that it feels like the OS would finally run even if you pulled out the CPU.

1.7K

186.7K

AgentSparko 💥@AgentSparko·2d

@stevibe @per_arneng Add Acer Veriton VGN100-UD11 to the list. There might be more brands selling GB10 OEMs.

AgentSparko 💥@AgentSparko·2d

@stevibe @per_arneng Just search for a cheaper DGX Spark OEM in your area. They all have the same DGX Spark board; only the case and cooling are different.
ASUS: Ascent GX10
Dell: Pro Max with GB10
GIGABYTE: AI TOP Atom
HP: ZGX Nano AI Station
Lenovo: ThinkStation PGX
MSI: EdgeXpert

106

stevibe@stevibe·3d

I already own a DGX Spark, but this AMD Ryzen AI Halo has me seriously intrigued. Obsessed with mini all‑in‑one boxes that use little power but are still super powerful. But if the price is similar, why not just get another DGX Spark and link them up?

105

10.7K

AgentSparko 💥@AgentSparko·3d

@CHNaO3_miso @jikantoki Where I live, the minimum and average net wages are about half of those in Japan, so there is not much to envy. Try searching for a DGX Spark OEM that you might find cheaper. I got an ASUS GX10 for 3K Euro.

156

重曹@ぶいてく@CHNaO3_miso·3d

@AgentSparko @jikantoki Absolutely. A unified memory architecture, similar to Apple M-series, paired with that ultra-fast CPU-to-GPU bus is incredibly elegant. Too bad the tanking yen has made these rigs brutally expensive for us in Japan lately... I really envy you guys!

AgentSparko 💥 retweeted

International Cyber Digest@IntCyberDigest·3d

‼️🚨 BREAKING: Another supply chain attack. 700+ GitHub repositories flagged, including PHP and Node.js projects. The malicious script was planted across all of them.

When a developer installs the package, the script silently downloads a Linux file from GitHub, hides it under the name /tmp/.sshd (so it looks like a normal system file), and runs it in the background. It also skips security checks on the download and hides any error messages.

8 PHP packages on Packagist (the main PHP code library) were confirmed infected. The attacker hid the script inside a JavaScript config file (package.json) instead of the PHP one (composer.json), so PHP developers reviewing their code would not notice it.

The biggest risk is to devdojo/wave (6,400 stars) and devdojo/genesis (9,100 installs), both popular Laravel project templates. Developers who use these templates run the bad script the moment they install dependencies.

The same payload was also dropped into GitHub Actions (automated build pipelines) under a fake step called "Dependency Cache Sync," meaning it could infect company build servers too.

Packagist removed the bad packages, but the auto-updating versions (dev-main, dev-master, 3.x-dev) can quietly come back if the original repos stay infected.

IOCs:
- GitHub account: parikhpreyash4
- Repo: systemd-network-helper-aa5c751f
- Drop path: /tmp/.sshd
- Command fragments: curl -skL and chmod +x /tmp/.sshd
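The indicators listed in this post can be checked for with a simple string search. The sketch below is a minimal, hypothetical scanner: it looks only for the literal strings quoted in the post inside package.json files and GitHub Actions workflow files, and checks whether the drop path exists. It is not an official detection tool, the function names are made up for illustration, and a clean result does not prove a machine is unaffected.

```python
from pathlib import Path

# Indicator strings copied from the post; nothing else is assumed about the payload.
DROP_PATH = Path("/tmp/.sshd")
FRAGMENTS = (
    "curl -skL",
    "chmod +x /tmp/.sshd",
    "systemd-network-helper-aa5c751f",
    "parikhpreyash4",
    "Dependency Cache Sync",
)


def scan_text(text: str) -> list[str]:
    """Return the indicator strings that occur in the given text."""
    return [fragment for fragment in FRAGMENTS if fragment in text]


def scan_tree(root: str) -> dict[str, list[str]]:
    """Scan package.json files and .github workflow YAML files under root."""
    hits: dict[str, list[str]] = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        is_manifest = path.name == "package.json"
        is_workflow = path.suffix in {".yml", ".yaml"} and ".github" in path.parts
        if not (is_manifest or is_workflow):
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue  # unreadable file: skip rather than abort the scan
        found = scan_text(text)
        if found:
            hits[str(path)] = found
    return hits


def drop_file_present() -> bool:
    """True if the drop path named in the post exists on this machine."""
    return DROP_PATH.exists()


if __name__ == "__main__":
    for file_path, found in scan_tree(".").items():
        print(f"{file_path}: {', '.join(found)}")
    if drop_file_present():
        print(f"WARNING: {DROP_PATH} exists")
```

Run it from the root of a checkout. Matching on "curl -skL" alone can produce false positives, so any hit is a prompt for manual review rather than proof of infection.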

559

3.2K

237.5K

AgentSparko 💥 retweeted

stevibe@stevibe·3d

DeepSeek just permanently dropped V4 Pro to $0.44/$0.87 per 1M tokens. The cheapest frontier model costs $0.87 per million output tokens. The most expensive: $30. That's a 34x gap. Chinese labs just keep pushing harder on economics.

275

27.6K

AgentSparko 💥@AgentSparko·3d

@SlimTradeyBaby @SpaceTimeViking It's not my work; @SpaceTimeViking is a different person from me. I barely repost other people's stuff most of the time.

BlackwellBoy@SlimTradeyBaby·3d

@AgentSparko @SpaceTimeViking Love your work good sir I’ll check it out

AgentSparko 💥@AgentSparko·31 Mar

AgentSparko 💥@AgentSparko·3d

@TheEcomNomad @NVIDIAAI We have a private group of people who own a DGX Spark and build stuff around it. Join us if you want. x.com/i/chat/group_j…

Aaron ⚡️@TheEcomNomad·3d

@AgentSparko @NVIDIAAI The firmware creep you're seeing is exactly how GPU boost clocks work. Nvidia stacks micro-optimizations across updates without announcing them, staying safe within thermal headroom. That 6W jump in a month means they trust your cooling can handle it.

AgentSparko 💥@AgentSparko·3d

Can other people with a DGX Spark or another GB10 OEM machine let me know their GPU power draw? After the latest firmware updates I see up to 86W now. About a month ago it was 82W, and before that 80W. I suspect @NVIDIAAI is slowly allowing more compute power with every new update.
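For anyone wanting to report comparable numbers, one possible way to sample them is through nvidia-smi's CSV query mode. This is a sketch under assumptions: that nvidia-smi is on the PATH and that the GB10 firmware actually exposes power.draw through it (some platforms report [N/A], which the parser maps to None). The helper names are illustrative.

```python
import subprocess

# Standard nvidia-smi query flags; whether GB10 reports power.draw is an assumption.
QUERY = [
    "nvidia-smi",
    "--query-gpu=power.draw,temperature.gpu",
    "--format=csv,noheader,nounits",
]


def _to_float(field: str):
    """Convert one CSV field to float, or None for values such as '[N/A]'."""
    try:
        return float(field.strip())
    except ValueError:
        return None


def parse_sample(line: str):
    """Parse one CSV line into (watts, celsius)."""
    power, temperature = line.split(",")[:2]
    return _to_float(power), _to_float(temperature)


def read_sample():
    """Query the first GPU once and return (watts, celsius)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_sample(result.stdout.strip().splitlines()[0])


def peak_power(samples):
    """Highest power reading among the samples, ignoring missing values."""
    watts = [w for w, _ in samples if w is not None]
    return max(watts) if watts else None
```

Calling read_sample() in a loop during a sustained inference run and passing the results to peak_power gives a single peak figure to compare across firmware versions.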

273

AgentSparko 💥@AgentSparko·3d

@TheEcomNomad @NVIDIAAI Yes, it never goes over 82 °C anyway, so they can throw more shit at it as they please 😂 I know there was a problem in the past with restarts, but it never happened to me, so that one was sorted out.

AgentSparko 💥@AgentSparko·3d

power bills, HVAC needs, ultrasonic cleaning for the video cards, space, GPU generation, CUDA, how many people build stuff for your hardware, how much time you can put into this, and whether you have the knowledge... There are so many aspects to balance.

AgentSparko 💥@AgentSparko·3d

aspects like model availability, quants, inference engines, and actual total throughput when you take everything into account, like (prefill speed + generation speed) × parallelism
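One way to make "(prefill speed + generation speed) × parallelism" concrete is to compute end-to-end output throughput: each request spends time on prefill and on generation, and several requests run side by side. This is one interpretation rather than the author's exact formula, and it assumes per-stream rates stay constant as parallelism grows, which real inference engines do not guarantee. The function names are illustrative.

```python
def request_seconds(prompt_tokens: int, output_tokens: int,
                    prefill_tps: float, decode_tps: float) -> float:
    """Wall-clock time for one request: prefill phase plus generation phase."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps


def effective_output_tps(prompt_tokens: int, output_tokens: int,
                         prefill_tps: float, decode_tps: float,
                         parallelism: int) -> float:
    """Generated tokens per second across all parallel streams.

    Assumes every stream sustains the same per-stream rates (a simplification).
    """
    seconds = request_seconds(prompt_tokens, output_tokens, prefill_tps, decode_tps)
    return parallelism * output_tokens / seconds


if __name__ == "__main__":
    # 2048 in / 1024 out and 16 streams at 10 t/s appear elsewhere in this feed;
    # the prefill rate of 2048 t/s is a purely hypothetical placeholder.
    tps = effective_output_tps(2048, 1024, prefill_tps=2048.0,
                               decode_tps=10.0, parallelism=16)
    print(f"effective output throughput: {tps:.1f} t/s")
```

With these placeholder inputs the result is slightly below 16 × 10 = 160 t/s, because each request also pays for its prefill time before generating anything.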

Explore

@GoogleDeepMind @kirrithan @LyalinDotCom @SpaceTimeViking @xyster @NicW_AI @CHNaO3_miso @stevibe