AgentSparko 💥

4K posts

@AgentSparko

#AI #Cybersecurity #Linux #privacy If you own a DGX Spark, you might wanna follow.

Middle of the GPU · Joined January 2023
1.4K Following · 2K Followers
Pinned post
AgentSparko 💥 @AgentSparko
For anyone saying DGX Spark cannot cook: generating datasets for distillation using Qwen3.5-35B-A3B BF16 (no quants)!!! Real data, 0% cache hit, concurrency=192; pp=2048 tokens in; tq=1024 tokens out. That's 1.43M tokens generated every hour for the last 8 hours for 40 W/h. 😎
11 replies · 2 reposts · 21 likes · 5.2K views
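The figures in the pinned post can be turned into per-second and per-stream rates with a little arithmetic. The sketch below is only a back-of-the-envelope check, assuming that "tokens generated" counts output tokens only and that all 192 streams were busy for the whole run; the function name and dictionary keys are made up for illustration.

```python
# Figures quoted in the post (the interpretation of each is an assumption).
TOKENS_PER_HOUR = 1_430_000        # "1.43M tokens generated every hour"
CONCURRENCY = 192                  # "concurrency=192"
HOURS = 8                          # "for the last 8 hours"
OUTPUT_TOKENS_PER_REQUEST = 1024   # "1024 tokens out"


def throughput_summary(tokens_per_hour, concurrency, hours, out_tokens):
    """Derive simple rates from the quoted hourly throughput."""
    aggregate_tps = tokens_per_hour / 3600           # tokens/s across all streams
    per_stream_tps = aggregate_tps / concurrency     # tokens/s seen by one stream
    total_tokens = tokens_per_hour * hours           # tokens over the whole run
    requests_per_hour = tokens_per_hour / out_tokens # completed requests per hour
    return {
        "aggregate_tps": aggregate_tps,
        "per_stream_tps": per_stream_tps,
        "total_tokens": total_tokens,
        "requests_per_hour": requests_per_hour,
    }


summary = throughput_summary(
    TOKENS_PER_HOUR, CONCURRENCY, HOURS, OUTPUT_TOKENS_PER_REQUEST
)
for key, value in summary.items():
    print(f"{key}: {value:,.2f}")
```

Under those assumptions the run works out to roughly 397 tokens/s in aggregate, about 2 tokens/s per stream, and about 11.4M tokens over the 8 hours.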
AgentSparko 💥 reposted
How To Prompt @HowToPrompt__
Researchers show that Claude Code is 98% not AI.
Anthropic never gave us the architecture for Claude Code. There were no docs. Just a tool that every developer is currently obsessing over. Until it leaked recently.
A research team pulled the source code, analyzed all 500,000 lines, and found something ridiculous. Only 1.6% of the codebase actually interacts with the AI model.
The core of Claude Code is literally just a simple while-loop. It asks the model what to do, runs a tool, and repeats.
So what is the other 98.4%? It is hardcore, traditional software engineering. The researchers found a massive, complex infrastructure designed entirely to babysit the AI and keep it from hallucinating or destroying your computer:
- A 7-mode permission system acting as a security bouncer.
- A 5-layer context compaction pipeline so the AI doesn't forget its goal.
- A subagent delegation mechanism with strict worktree isolation.
- Four different extensibility hooks to manage external tools safely.
Every startup right now is trying to build a better AI model to get better results. Anthropic did the exact opposite. They took an existing model and built a fortress of deterministic software around it. They realized that the AI doesn't need to be smarter. It needs to be managed.
61 replies · 150 reposts · 761 likes · 68.3K views
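The "simple while-loop" described in the post above (ask the model what to do, run a tool, repeat) can be sketched in a few lines. This is a toy illustration, not Claude Code's actual implementation: `run_agent`, `scripted_model`, and `add_tool` are hypothetical names, the model is a hard-coded stand-in rather than a real API call, and the tool allow-list only gestures at the kind of guard rails the post mentions.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical action format: (tool_name, argument). ("done", answer) ends the loop.
Action = Tuple[str, str]


def run_agent(
    model: Callable[[List[str]], Action],
    tools: Dict[str, Callable[[str], str]],
    goal: str,
    max_steps: int = 10,
) -> str:
    """Ask the model for an action, run the named tool, record the result, repeat."""
    history = [f"goal: {goal}"]
    for _ in range(max_steps):          # bounded loop so a confused model cannot spin forever
        name, arg = model(history)
        if name == "done":
            return arg
        if name not in tools:           # only tools that were explicitly registered may run
            history.append(f"error: unknown tool {name}")
            continue
        result = tools[name](arg)
        history.append(f"{name}({arg}) -> {result}")
    return "stopped: step limit reached"


def scripted_model(history: List[str]) -> Action:
    """Stand-in for a real model: call the add tool once, then report its result."""
    if len(history) == 1:
        return ("add", "2 3")
    return ("done", history[-1].split(" -> ")[1])


def add_tool(arg: str) -> str:
    """Toy tool: add two space-separated integers."""
    a, b = arg.split()
    return str(int(a) + int(b))


answer = run_agent(scripted_model, {"add": add_tool}, "add 2 and 3")
print(answer)
```

Even in this toy version most of the lines are bookkeeping around the model call (step limit, tool allow-list, history), which is the proportion the post is pointing at.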
AgentSparko 💥 reposted
Tech2Wild @Tech2Wild
✅ Repo pushed: all updates are live. Commit eb12c02 on github.com/tonyd2wild/min…:
• Phase 3 (RoCE) flipped from "WIP / err-110 blocked" → "SOLVED 2026-06-15" with the full recipe
• Both fixes documented: NCCL v2.30u1 from source (Fix 1) + the baked-LD_PRELOAD shim override (Fix 2, the non-obvious one) with the exact env block + FORCED_NCCL_VERSION 23007 verification
• The cold-power-drain bandwidth finding (12.8 → 111.85 Gb/s, credited mashie)
• Honest RESULTS block (~10.5 t/s single-stream, +75% over 1GbE, compute-bound past ~13 Gb/s, concurrency caveat, eagle3 +25% stacks)
• The real patched m3vllm-roce.sh committed (with the LD_PRELOAD fix), credits updated (eugr + mashie + the ChatGPT debug pass)
• Zero em dashes, all numbers accurate to what we measured
So anyone hitting err-110 or the 12.8 cap now has the answer. The 200K M3 is still finishing its boot. The watcher will confirm it's serving clean, then we're fully wrapped on this.
0 replies · 3 reposts · 8 likes · 202 views
AgentSparko 💥 reposted
Charles Curran @charliebcurran
I used AI to explain the Anthropic drama to my girlfriend, with fruit.
288 replies · 502 reposts · 7.7K likes · 1.1M views
AgentSparko 💥 @AgentSparko
@sudoingX AMD is actually more expensive than a Spark if you buy a Spark OEM unit such as the Asus GX10, and with the Spark you also get high-speed connectivity for clustering, CUDA, and software compatibility. Also, forcing the test onto llama.cpp and GGUF only does not show peak performance or quality for NVIDIA.
0 replies · 0 reposts · 2 likes · 580 views
Sudo su @sudoingX
nvidia vs amd two boxes on my desk, both 128gb of unified memory. one is the nvidia dgx spark ($4,699). the other is the amd strix halo ($1,999), amd at roughly half the price. i'm running the exact same models on both, from a 3b all the way up to a 397b, same quants, same llama.cpp, and i'm posting every single number. here is why it actually matters. if the amd box just keeps pace, that's a nice story. but if it matches or beats a box that costs twice as much, the entire calculus for buying local ai hardware changes overnight. i already have the first numbers and they made me sit up. holding them for the full breakdown. stay tuned anon. this matchup is going to shake some ground.
48 replies · 21 reposts · 476 likes · 32.5K views
AgentSparko 💥 reposted
CyberRobo @CyberRobooo
Hard to say no to a cute little one. It's only 12 kg, like a toddler under 2, yet it has 21 joints and can run, jump, and gently hug you…
Beijing Luvbotics is redefining what a living humanoid robot can be: something like a family member. It certainly doesn't cook, do laundry, or clean, but it's a real emotional companion.
> 65 cm tall, 95% soft skin-like shell with a constant 35-40°C body temperature, warm and comforting to touch
> Runs up to ~2 m/s, steps over 15 cm (park stairs friendly), and stays whisper-quiet under 50 dB when walking
> Unique voice with its own acoustic "DNA," emotion-driven gaits, and expressive animated eyes
> Fast/slow brain architecture + long-term memory, so its personality naturally evolves with you
---
(Tbh, I really like the design and the considerations they applied to the HRI.)
17 replies · 60 reposts · 282 likes · 35.9K views
AgentSparko 💥 reposted
Tech2Wild @Tech2Wild
Got MiniMax-M3 (428B MoE, NVFP4) serving at tensor-parallel 3 across 3 DGX Sparks with clean tool-calling. Published the full recipe plus the head-node OOM fixes that gated it. Speed's still rough, so tear it apart and help us fix it: github.com/tonyd2wild/min…
5 replies · 5 reposts · 37 likes · 3.3K views
AgentSparko 💥 reposted
mr-r0b0t @mr_r0b0t
A new specialist subagent, purpose trained to efficiently search your repo, was just released by Microsoft! Say hello to FastContext 😍
5 replies · 3 reposts · 39 likes · 2.3K views
AgentSparko 💥 reposted
ÆON FORGE ✨ @SpaceTimeViking
Receipts in video: see it float at ~100-150 Tok/s while coding; the fluctuations came from task and context switching by the model. This thing rips through code! A single @NVIDIAAI DGX Spark ⚡️
Quoting ÆON FORGE ✨ @SpaceTimeViking:
Major stability update: the old image would collapse the DFlash acceptance rate quickly after use due to a vLLM bug. It would drop to as low as 20 Tok/s after initial usage. Resolved with patch pr41703. Now getting SUSTAINED coding generation speeds at ~150 Tok/s! Pull latest now!
4 replies · 4 reposts · 32 likes · 9.3K views
AgentSparko 💥 reposted
ÆON FORGE ✨ @SpaceTimeViking
Major stability update: the old image would collapse the DFlash acceptance rate quickly after use due to a vLLM bug. It would drop to as low as 20 Tok/s after initial usage. Resolved with patch pr41703. Now getting SUSTAINED coding generation speeds at ~150 Tok/s! Pull latest now!
Quoting ÆON FORGE ✨ @SpaceTimeViking:
So I've been validating my models with the latest version of my DGX Spark / Blackwell optimized vLLM container, and I'm floored by the benchmark results I just got with my Gemma 4 26B A4B model: 144 Tok/s on coding! Over 1700 Tok/s agg with 128 c! Get the latest container and recipe now! github.com/AEON-7/Gemma-4…
3 replies · 4 reposts · 26 likes · 7.2K views
AgentSparko 💥 reposted
Photographer @photo5065
[Media-only post, no text]
82 replies · 498 reposts · 6.1K likes · 503.5K views
AgentSparko 💥 reposted
Anthropic @AnthropicAI
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…
12.5K replies · 25.7K reposts · 87.9K likes · 89.4M views
AgentSparko 💥 reposted
Tech2Wild @Tech2Wild
In the document here, MiniMax mentions a 109B MoE model and has open-sourced the sparse attention kernel behind it: 28.4x less compute at 1M context, 14.2x faster prefill, 7.6x faster decode, and it matches full attention on benchmarks. Is MiniMax 3 going to be even smaller?
Quoting RyanLee @RyanLeeMiniMax:
Hey everyone, our high-performance MSA kernel library is now open-source. The M3 weights are expected to drop this Friday. Thanks for waiting! Github: github.com/MiniMax-AI/MSA Paper: github.com/MiniMax-AI/MSA…
1 reply · 1 repost · 15 likes · 2K views
AgentSparko 💥 reposted
noname @malikwas1f
Up to 1100 tps on RTX 3090x2 for Diffusion Gemma 4 26B. Unleash this mini monster on your GPUs now! If you are running NVIDIA GPUs locally, come grab the recipe at club-3090. github.com/noonghunna/clu… P.S. a ⭐️ on GitHub is much appreciated. @googlegemma @vllm_project
9 replies · 5 reposts · 65 likes · 11.1K views
AgentSparko 💥 reposted
DROID @droidbuilds
"mom, how did we get so poor?" "your father had Claude Max, ChatGPT Pro, Cursor Pro and shipped absolutely nothing"
295 replies · 936 reposts · 13.8K likes · 700.2K views
AgentSparko 💥 reposted
ÆON FORGE ✨ @SpaceTimeViking
LOCAL LLM persona built with my AI person builder now supports LIVE VIDEO calling. Watch as Local AI Terence McKenna gazes upon his own silicon mind. Running on @GoogleAI Gemma 4 26B-A4B-Aeon. He seems to greatly admire the craftsmanship of the @NVIDIAAI DGX Spark. Links ⤵️
8 replies · 4 reposts · 65 likes · 6.1K views
AgentSparko 💥 reposted