AgentSparko 💥
3.8K posts

@AgentSparko
#AI #Cybersecurity #Linux #privacy If you own a DGX Spark you might wanna follow.
Middle of the GPU. Joined January 2023
1.3K Following, 2K Followers
Pinned Tweet
AgentSparko 💥 retweeted

i just beat @GoogleDeepMind's turboquant
introducing Shard. 10x KV cache compression on Llama-3.1-8B. zero quality loss
- 10x @ 8K context, 11.2x @ 32K
- NIAH recall 1.000 across 4K-32K
- LongBench Δ ≈ 0 vs FP16
turboquant tops out at 4-6x at the same quality. we doubled it.
read more: krishgarg.com/shard
@kirrithan
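To put the compression ratios in that tweet into memory terms, here is a rough back-of-the-envelope sketch. It assumes the public Llama-3.1-8B shape (32 layers, 8 KV heads, head dimension 128), an FP16 baseline, a single sequence, and that "8K" and "32K" mean 8192 and 32768 tokens. It only does the size arithmetic and says nothing about how Shard itself works.

```python
# Rough KV-cache size for one sequence on Llama-3.1-8B.
# The model shape below is an assumption taken from the public model config.
LAYERS = 32
KV_HEADS = 8        # grouped-query attention: fewer KV heads than query heads
HEAD_DIM = 128
BYTES_FP16 = 2


def kv_bytes_per_token() -> int:
    # The factor 2 covers keys and values.
    return 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_FP16


def kv_cache_mib(context_tokens: int, compression: float = 1.0) -> float:
    # Size in MiB after dividing the FP16 cache by the claimed ratio.
    return kv_bytes_per_token() * context_tokens / compression / 2**20


for ctx, ratio in [(8192, 10.0), (32768, 11.2)]:
    fp16 = kv_cache_mib(ctx)
    packed = kv_cache_mib(ctx, ratio)
    print(f"{ctx} tokens: {fp16:.0f} MiB FP16 -> {packed:.1f} MiB at {ratio}x")
```

Under these assumptions the FP16 cache is about 1 GiB per sequence at 8K context, so a 10x ratio would bring it down to roughly 102 MiB.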

@LyalinDotCom Ollama is the worst.
The Spark has hardware tensor cores for NVFP4, and vLLM is the right inference engine, though you still have to apply optimizations to it.
@SpaceTimeViking's Docker container and model are the best.
You will get 16 parallel streams at 10 t/s each with this: github.com/AEON-7/Gemma-4…

@xyster You are comparing INT4 with lossless NVFP4, for which the Spark also has dedicated tensor hardware. You don't take into account long-term energy use, HVAC needs, or a software implementation that is at least 2 years behind. (prefill + tg) × parallelism = actual speed. x.com/AgentSparko/st…
AgentSparko 💥 @AgentSparko
The problem is that people do not understand all the costs of doing inference. They see on X "my mac or 3090 did xx t/s for y model" and think that this is everything but totally fail to see the whole picture of how that translates for the next 5 years of actually using the thing
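A minimal sketch of that "whole picture" arithmetic: aggregate throughput as per-stream speed times parallel streams (a simplification of the "(prefill + tg) × parallelism" idea), plus energy use over several years. The `Rig` class, the two example boxes, and every number in them are hypothetical placeholders, not measurements of any real machine.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class Rig:
    name: str
    tokens_per_s_per_stream: float
    parallel_streams: int
    avg_watts: float

    def aggregate_tokens_per_s(self) -> float:
        # Single-stream speed is only part of the story; streams multiply it.
        return self.tokens_per_s_per_stream * self.parallel_streams

    def energy_kwh(self, years: float, duty_cycle: float = 1.0) -> float:
        # duty_cycle is the fraction of time the box is actually under load.
        return self.avg_watts / 1000 * HOURS_PER_YEAR * years * duty_cycle

    def wh_per_million_tokens(self) -> float:
        tokens_per_hour = self.aggregate_tokens_per_s() * 3600
        return self.avg_watts / tokens_per_hour * 1e6


# Hypothetical rigs: all figures are placeholders chosen for illustration.
rigs = [
    Rig("box A", tokens_per_s_per_stream=10.0, parallel_streams=16, avg_watts=100.0),
    Rig("box B", tokens_per_s_per_stream=40.0, parallel_streams=1, avg_watts=350.0),
]

for rig in rigs:
    print(
        f"{rig.name}: {rig.aggregate_tokens_per_s():.0f} t/s aggregate, "
        f"{rig.energy_kwh(years=5):.0f} kWh over 5 years, "
        f"{rig.wh_per_million_tokens():.0f} Wh per million tokens"
    )
```

Multiplying `energy_kwh` by a local electricity price gives the running cost; cooling and hardware price would need to be added on top.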

AgentSparko 💥 retweeted

The problem is that people do not understand all the costs of doing inference. They see on X "my mac or 3090 did xx t/s for y model" and think that this is everything but totally fail to see the whole picture of how that translates for the next 5 years of actually using the thing
AgentSparko 💥 @AgentSparko
For anyone saying DGX Spark cannot cook: generating datasets for distillation using Qwen3.5-35B-A3B in BF16 (no quants!), real data, 0% cache hit, concurrency=192, pp=2048 tokens in, tg=1024 tokens out. That's 1.43M tokens generated every hour for the last 8 hours at an average draw of about 40 W. 😎
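The figures in that tweet can be unpacked with a little arithmetic. This sketch assumes the 1.43M figure counts generated (output) tokens only and reads the quoted power figure as an average draw of about 40 W; `summarize_run` is a hypothetical helper name.

```python
def summarize_run(tokens_per_hour: float, concurrency: int,
                  avg_watts: float, hours: float) -> dict:
    # Convert an hourly token count into per-second and per-stream rates,
    # and relate it to energy use. avg_watts is an assumed average draw.
    aggregate_tps = tokens_per_hour / 3600
    return {
        "aggregate_tokens_per_s": aggregate_tps,
        "tokens_per_s_per_stream": aggregate_tps / concurrency,
        "total_tokens": tokens_per_hour * hours,
        "wh_per_million_tokens": avg_watts / tokens_per_hour * 1e6,
    }


stats = summarize_run(tokens_per_hour=1.43e6, concurrency=192,
                      avg_watts=40.0, hours=8)
for key, value in stats.items():
    print(f"{key}: {value:,.2f}")
```

That works out to roughly 397 t/s in aggregate, about 2 t/s per stream, and about 28 Wh per million generated tokens under the stated assumptions.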

@CHNaO3_miso Add Acer Veriton VGN100-UD11 to the list. There might be more brands selling GB10 OEMs.

@CHNaO3_miso They all have the same DGX Spark board, just the case and the cooling is different.

@stevibe @per_arneng Add Acer Veriton VGN100-UD11 to the list. There might be more brands selling GB10 OEMs.

@stevibe @per_arneng Just search for a cheaper DGX Spark OEM in your area. They all have the same DGX Spark board, just the case and cooling are different.
ASUS: Ascent GX10
Dell: Pro Max with GB10
GIGABYTE: AI TOP Atom
HP: ZGX Nano AI Station
Lenovo: ThinkStation PGX
MSI: EdgeXpert

@CHNaO3_miso @jikantoki Where I live the minimum and average net wages are half of what they are in Japan, so there is not much to envy. Try searching for a DGX Spark OEM that you might find cheaper. I got an ASUS GX10 for 3K euros.

@AgentSparko @jikantoki Absolutely.
A unified memory architecture, similar to Apple M-series, paired with that ultra-fast CPU-to-GPU bus is incredibly elegant.
Too bad the tanking yen has made these rigs brutally expensive for us in Japan lately... I really envy you guys!

AgentSparko 💥 retweeted

‼️🚨 BREAKING: Another supply chain attack. 700+ GitHub repositories flagged, including PHP and Node.js projects. The malicious script was planted across all of them. When a developer installs the package, the script silently downloads a Linux file from GitHub, hides it under the name /tmp/.sshd (so it looks like a normal system file), and runs it in the background. It also skips security checks on the download and hides any error messages.
8 PHP packages on Packagist (the main PHP code library) were confirmed infected. The attacker hid the script inside a JavaScript config file (package.json) instead of the PHP one (composer.json), so PHP developers reviewing their code would not notice it. The biggest risk is to devdojo/wave (6,400 stars) and devdojo/genesis (9,100 installs), both popular Laravel project templates. Developers who use these templates run the bad script the moment they install dependencies.
The same payload was also dropped into GitHub Actions (automated build pipelines) under a fake step called "Dependency Cache Sync," meaning it could infect company build servers too. Packagist removed the bad packages, but the auto-updating versions (dev-main, dev-master, 3.x-dev) can quietly come back if the original repos stay infected.
IOCs:
GitHub account: parikhpreyash4
Repo: systemd-network-helper-aa5c751f
Drop path: /tmp/.sshd
Command fragments: curl -skL and chmod +x /tmp/.sshd
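A quick way to check a local checkout against the IOCs listed above is a plain string search over the files the report mentions: package.json, composer.json, and GitHub Actions workflow files. This is a minimal sketch, not a vetted detection tool. The function names are hypothetical, and `curl -skL` on its own can match legitimate scripts, so treat hits as leads to review by hand.

```python
from pathlib import Path

# Indicator strings copied from the report above.
IOC_STRINGS = (
    "parikhpreyash4",
    "systemd-network-helper-aa5c751f",
    "/tmp/.sshd",
    "curl -skL",
    "Dependency Cache Sync",
)


def find_iocs(text: str) -> list[str]:
    # Return every indicator that appears verbatim in the text.
    return [ioc for ioc in IOC_STRINGS if ioc in text]


def is_candidate(path: Path) -> bool:
    # Only look at manifest files and GitHub Actions workflows.
    if path.name in ("package.json", "composer.json"):
        return True
    in_workflows = ".github" in path.parts and "workflows" in path.parts
    return in_workflows and path.suffix in (".yml", ".yaml")


def scan_tree(root: str) -> list[tuple[str, str]]:
    # Walk the tree and collect (file, indicator) pairs.
    hits = []
    for path in Path(root).rglob("*"):
        if path.is_file() and is_candidate(path):
            text = path.read_text(encoding="utf-8", errors="ignore")
            hits.extend((str(path), ioc) for ioc in find_iocs(text))
    return hits


# Example usage (scans the current directory):
#   for file_path, ioc in scan_tree("."):
#       print(f"{file_path}: {ioc}")
```

Checking for the drop path on the machine itself (does /tmp/.sshd exist?) is a separate step that this sketch does not cover.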


AgentSparko 💥 retweeted

@SlimTradeyBaby @SpaceTimeViking It's not my work, @SpaceTimeViking is a different person than me. I mostly just repost other people's stuff.

@AgentSparko @SpaceTimeViking Love your work, good sir. I’ll check it out.

@TheEcomNomad @NVIDIAAI We have a private group of people who own a DGX Spark and build stuff around it. Join us if you want. x.com/i/chat/group_j…

@AgentSparko @NVIDIAAI The firmware creep you're seeing is exactly how GPU boost clocks work. Nvidia stacks micro-optimizations across updates without announcing them, staying safe within thermal headroom. That 6W jump in a month means they trust your cooling can handle it.

Can other people with a DGX Spark or other GB10 OEM boxes let me know their GPU power draw? After the latest firmware updates I now see up to 86 W. About a month ago it was 82 W, and before that 80 W. I suspect @NVIDIAAI is slowly allowing more compute power with every new update.
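For anyone who wants to report a comparable number, here is a minimal sketch that samples GPU power draw through `nvidia-smi` and keeps the peak. It assumes `nvidia-smi` is on the PATH and that the GB10 driver reports the `power.draw` field; if it prints `[N/A]` instead, the parser simply skips that reading. The helper names are hypothetical.

```python
import subprocess
import time
from typing import Callable, Optional


def parse_power_draw(output: str) -> list[float]:
    # One line per GPU; non-numeric values such as "[N/A]" are skipped.
    readings = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            readings.append(float(line))
        except ValueError:
            continue
    return readings


def read_gpu_power_watts() -> list[float]:
    # Ask nvidia-smi for the current power draw in watts, without headers or units.
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_power_draw(result.stdout)


def peak_power(samples: int, interval_s: float,
               reader: Callable[[], list[float]] = read_gpu_power_watts) -> Optional[float]:
    # Poll the reader repeatedly and return the highest value seen, if any.
    peak = None
    for i in range(samples):
        for watts in reader():
            if peak is None or watts > peak:
                peak = watts
        if i < samples - 1:
            time.sleep(interval_s)
    return peak


# Example usage while a workload is running (60 samples, one per second):
#   print(peak_power(samples=60, interval_s=1.0))
```

Sampling once per second under sustained load should be enough to catch the kind of 80 W versus 86 W difference described above, though short spikes between samples can be missed.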


@TheEcomNomad @NVIDIAAI Yes, it never goes over 82 °C anyway, so they can throw more shit at it as they please 😂 I know there was a problem with restarts in the past, but it never happened to me, so that one was sorted out.