AgentSparko 💥

4K posts

@AgentSparko

#AI #Cybersecurity #Linux #privacy If you own a DGX Spark you might wanna follow.

Middle of the GPU · Joined January 2023

1.4K Following · 2K Followers

Pinned Tweet

AgentSparko 💥@AgentSparko·31 Mar

For anyone saying DGX Spark cannot cook: generating datasets for distillation using Qwen3.5-35B-A3B BF16 (no quants)! Real data, 0% cache hit, concurrency=192; pp=2048 tokens in; tq=1024 tokens out. That's 1.43M tokens generated every hour for the last 8 hours for 40 W/h. 😎
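
The throughput quoted in the post above can be sanity-checked with a little arithmetic. This is a minimal sketch that takes the post's numbers at face value (1.43M generated tokens per hour, 192 concurrent requests, an 8-hour run); the variable names are illustrative and nothing here was measured independently.

```python
# Numbers quoted in the post; taken at face value, not independently measured.
TOKENS_PER_HOUR = 1_430_000   # generated (output) tokens per hour
CONCURRENCY = 192             # simultaneous requests
HOURS = 8                     # duration of the run

aggregate_tps = TOKENS_PER_HOUR / 3600         # tokens/s summed over all streams
per_stream_tps = aggregate_tps / CONCURRENCY   # tokens/s seen by a single request
total_tokens = TOKENS_PER_HOUR * HOURS         # tokens generated over the whole run

print(f"aggregate:  {aggregate_tps:.1f} tok/s")
print(f"per stream: {per_stream_tps:.2f} tok/s")
print(f"total:      {total_tokens / 1e6:.2f}M tokens")
```

Read this way, the headline figure is an aggregate batch number: each individual request is slow, but 192 of them run side by side.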

AgentSparko 💥@AgentSparko·4h

@onusoz If you want 100+ t/s single-stream, test @SpaceTimeViking's DFlash Docker container and uncensored model: github.com/AEON-7/Qwen3.6…

Onur Solmaz@onusoz·9h

nvidia/Qwen3.6-35B-A3B-NVFP4 running in vLLM nightly on my Nvidia GB10 is actually insane: 50 tok/s with 4 concurrent generations, 200 tok/s total. Ideal for spawning subagents or working in parallel. Its tool-calling behavior is very good as well. I will be giving it a test drive on an openclaw instance and will keep you posted. More details on the NVIDIA forum: forums.developer.nvidia.com/t/benchmark-re…

AgentSparko 💥 retweeted

Qwen@Alibaba_Qwen·5h

📣 Introducing the Qwen-Robot Suite: Qwen-RobotNav, Qwen-RobotManip, Qwen-RobotWorld. Three foundation models, a full stack for embodied intelligence.

🧭 Qwen-RobotNav: the gateway to mobility.
• Unifies 5 navigation tasks in one model: instruction following, point-goal, object-goal, target tracking, autonomous driving
• Controllable observation protocol
• Tool interface for agentic systems

🤖 Qwen-RobotManip: the foundation of interaction.
• Unified state-action space across heterogeneous robots
• Camera-frame delta poses for coherent cross-embodiment training
• Pretrained on a 38,100+ hour open-source corpus

🌍 Qwen-RobotWorld: infinite worlds for physical agents.
• Single world model, 20+ embodiments
• Natural-language action interface
• Predicts physically grounded futures across manipulation, driving, and navigation

Each model is independently useful, and could be composed as physical-world tools. Together, they form the low-level toolkit for general-purpose agentic systems that don't just see the world, but act in it.

📷 Blog: qwen.ai/blog?id=qwen-r…
📖 Report:
Qwen-RobotNav: …anwen-res.oss-accelerate.aliyuncs.com/qwenrobot/pape…
Qwen-RobotManip: …anwen-res.oss-accelerate.aliyuncs.com/qwenrobot/pape…
Qwen-RobotWorld: …anwen-res.oss-accelerate.aliyuncs.com/qwenrobot/pape…

AgentSparko 💥 retweeted

Sadao Tokuyama@tokufxug·1d

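NVIDIA's 3D human model "SOMA-X v0.2" has been released. A single skeleton can represent any body type, which makes it well suited to robots and physical AI. It offers natural deformation through joint-twist correction, automatic bone scaling, advanced pose flipping, and ultra-lightweight data. It is open-sourced under Apache 2.0. (Details in the replies.)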

AgentSparko 💥@AgentSparko·14h

@no_stp_on_snek Depends on how well vLLM was optimized for GB10. @SpaceTimeViking makes the best Docker container for GB10 and uncensored models. 2.5 months ago he already got 73 t/s single-stream and 259 t/s at c=8: github.com/AEON-7/Nemotro…
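
To put the two figures in the reply above side by side, here is a small sketch of how single-stream and batched throughput relate. It uses only the numbers quoted in the reply (73 t/s single, 259 t/s at c=8); the helper name `concurrency_stats` is made up for illustration.

```python
def concurrency_stats(single_tps: float, aggregate_tps: float, concurrency: int):
    """Return (per-stream tok/s, aggregate speedup, scaling efficiency)."""
    per_stream = aggregate_tps / concurrency   # what each request sees when batched
    speedup = aggregate_tps / single_tps       # total throughput vs. one stream
    efficiency = speedup / concurrency         # 1.0 would be perfect linear scaling
    return per_stream, speedup, efficiency

# Figures quoted in the reply, taken at face value.
per_stream, speedup, efficiency = concurrency_stats(73.0, 259.0, 8)
print(f"{per_stream:.1f} tok/s per stream, {speedup:.2f}x aggregate, "
      f"{efficiency:.0%} of linear scaling")
```

Comparing engines on a single number can mislead: a decode-only single-stream figure and an aggregate figure at c=8 measure different things.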

Tom Turney@no_stp_on_snek·18h

My second and late Build Small submission. 10 days, 1 dev: a from-scratch Rust engine + custom GPU kernels vs vLLM on NVIDIA's GB10, on NVIDIA's own Nemotron-30B. Decode beats vLLM at every depth (75.7 vs 57 tok/s). Prefill close but no cigar. Not bad for the timeline. Time for a computer break 😅 huggingface.co/spaces/build-s… @huggingface @Gradio @nvidia #BuildSmall

AgentSparko 💥 retweeted

ÆON FORGE ✨@SpaceTimeViking·17h

Something fun is coming. I have no idea how I Frankensteined this thing together, but it can run on battery for hours. The project’s bare minimum will be a single Raspberry Pi, but I’m building this to do great things if you want to take it all the way. 4 hats + 1 month of dev

AgentSparko 💥 retweeted

How To Prompt@HowToPrompt__·1d

Researchers show that Claude Code is 98% not AI.

Anthropic never gave us the architecture for Claude Code. There were no docs. Just a tool that every developer is currently obsessing over. Until it leaked recently.

A research team pulled the source code, analyzed all 500,000 lines, and found something ridiculous. Only 1.6% of the codebase actually interacts with the AI model. The core of Claude Code is literally just a simple while-loop. It asks the model what to do, runs a tool, and repeats.

So what is the other 98.4%? It is hardcore, traditional software engineering. The researchers found a massive, complex infrastructure designed entirely to babysit the AI and keep it from hallucinating or destroying your computer:
- A 7-mode permission system acting as a security bouncer.
- A 5-layer context compaction pipeline so the AI doesn't forget its goal.
- A subagent delegation mechanism with strict worktree isolation.
- Four different extensibility hooks to manage external tools safely.

Every startup right now is trying to build a better AI model to get better results. Anthropic did the exact opposite. They took an existing model and built a fortress of deterministic software around it. They realized that the AI doesn't need to be smarter. It needs to be managed.
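
The "ask the model, run a tool, repeat" loop described in the post above can be sketched in a few lines. This is a generic illustration of the pattern and not Claude Code's actual implementation: `fake_model`, the `TOOLS` registry, and the `tool:` history prefix are all hypothetical stand-ins, and the allow-list check is only a toy version of the permission layer the post mentions.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry: names and behavior are invented for illustration.
TOOLS: Dict[str, Callable[[str], str]] = {
    "echo": lambda arg: arg,
    "upper": lambda arg: arg.upper(),
}

def fake_model(history: List[str]) -> dict:
    """Stand-in for a real model call: requests one tool, then finishes."""
    if not any(line.startswith("tool:") for line in history):
        return {"action": "tool", "name": "upper", "arg": history[0]}
    return {"action": "done", "answer": history[-1][len("tool:"):]}

def agent_loop(task: str, model=fake_model, max_steps: int = 10) -> str:
    history = [task]
    for _ in range(max_steps):              # bounded stand-in for `while True`
        step = model(history)               # 1. ask the model what to do
        if step["action"] == "done":
            return step["answer"]
        tool = TOOLS.get(step["name"])      # 2. toy permission check: allow-list only
        if tool is None:
            history.append("tool:error unknown tool " + step["name"])
            continue
        history.append("tool:" + tool(step["arg"]))  # 3. run the tool, record result
    raise RuntimeError("step budget exhausted")

print(agent_loop("hello"))  # prints HELLO
```

Even in this toy, most of the lines are bookkeeping around the single `model(history)` call, which is the post's point in miniature.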

AgentSparko 💥 retweeted

Steeve Morin@steeve·1d

Congratulations guys! That's built in Germany, btw. Yeah, the Germany in Europe. kthxbye.

Tensordyne@TensordyneInc

x.com/i/article/2066…

AgentSparko 💥 retweeted

Tech2Wild@Tech2Wild·22h

✅ Repo pushed: all updates are live. Commit eb12c02 on github.com/tonyd2wild/min…:
• Phase 3 (RoCE) flipped from "WIP / err-110 blocked" → "SOLVED 2026-06-15" with the full recipe
• Both fixes documented: NCCL v2.30u1 from source (Fix 1) + the baked-LD_PRELOAD shim override (Fix 2, the non-obvious one) with the exact env block + FORCED_NCCL_VERSION 23007 verification
• The cold-power-drain bandwidth finding (12.8 → 111.85 Gb/s, credited mashie)
• Honest RESULTS block (~10.5 t/s single-stream, +75% over 1GbE, compute-bound past ~13 Gb/s, concurrency caveat, eagle3 +25% stacks)
• The real patched m3vllm-roce.sh committed (with the LD_PRELOAD fix), credits updated (eugr + mashie + the ChatGPT debug pass)
• Zero em dashes, all numbers accurate to what we measured

So anyone hitting err-110 or the 12.8 cap now has the answer. The 200K M3 is still finishing its boot; the watcher will confirm it's serving clean, then we're fully wrapped on this.

AgentSparko 💥 retweeted

Charles Curran@charliebcurran·1d

I used AI to explain the Anthropic drama to my girlfriend, with fruit.

AgentSparko 💥@AgentSparko·1d

@sudoingX AMD is actually more expensive than Spark if you get a Spark OEM like the Asus GX10, and with Spark you also get high-speed connectivity for clustering, CUDA, and software compatibility. Also, forcing the test onto llama.cpp and GGUF only does not show peak performance or quality for NVIDIA.

Sudo su@sudoingX·1d

nvidia vs amd two boxes on my desk, both 128gb of unified memory. one is the nvidia dgx spark ($4,699). the other is the amd strix halo ($1,999), amd at roughly half the price. i'm running the exact same models on both, from a 3b all the way up to a 397b, same quants, same llama.cpp, and i'm posting every single number. here is why it actually matters. if the amd box just keeps pace, that's a nice story. but if it matches or beats a box that costs twice as much, the entire calculus for buying local ai hardware changes overnight. i already have the first numbers and they made me sit up. holding them for the full breakdown. stay tuned anon. this matchup is going to shake some ground.
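
The price argument in the post above reduces to simple arithmetic. A rough sketch using only the two prices quoted in the post; the throughput arguments to `better_value` are hypothetical inputs, since the post gives no benchmark numbers, and the names are illustrative.

```python
# Prices as quoted in the post (USD); OEM variants may be priced differently.
SPARK_PRICE = 4699.0
STRIX_HALO_PRICE = 1999.0

price_ratio = SPARK_PRICE / STRIX_HALO_PRICE   # how many times more the Spark costs
break_even = STRIX_HALO_PRICE / SPARK_PRICE    # fraction of Spark throughput the AMD
                                               # box needs to tie on tok/s per dollar

def better_value(spark_tps: float, strix_tps: float) -> str:
    """Compare tokens/s per dollar for two measured throughputs (hypothetical inputs)."""
    if strix_tps / STRIX_HALO_PRICE > spark_tps / SPARK_PRICE:
        return "strix halo"
    return "spark"

print(f"Spark costs {price_ratio:.2f}x as much")
print(f"break-even at {break_even:.1%} of Spark throughput")
```

At these prices the AMD box costs somewhat less than half as much, so on a pure tokens-per-dollar basis it would not need to match the Spark to come out ahead; that says nothing about software compatibility or clustering.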

AgentSparko 💥 retweeted

CyberRobo@CyberRobooo·3d

Hard to say no to a cute little one.

It's only 12 kg, like a toddler under 2, yet it has 21 joints and can run, jump, and gently hug you…

Beijing Luvbotics is redefining what a living humanoid robot can be: something like a family member. It certainly doesn't do the cooking, laundry, or cleaning, but it's a real emotional companion.

> 65 cm tall, 95% soft skin-like shell with a constant 35-40°C body temperature: warm and comforting to touch
> Runs up to ~2 m/s, steps over 15 cm (park-stairs friendly), and stays whisper-quiet under 50 dB when walking
> Unique voice with its own acoustic "DNA," emotion-driven gaits, and expressive animated eyes
> Fast/slow brain architecture + long-term memory, so its personality naturally evolves with you

(Tbh, I really like the design and the considerations they applied to the HRI.)

AgentSparko 💥 retweeted

Tech2Wild@Tech2Wild·1d

Got MiniMax-M3 (428B MoE, NVFP4) serving at tensor-parallel 3 across 3 DGX Sparks with clean tool-calling. Published the full recipe plus the head-node OOM fixes that gated it. Speed's still rough, so tear it apart and help us fix it: github.com/tonyd2wild/min…

AgentSparko 💥 retweeted

mr-r0b0t@mr_r0b0t·1d

A new specialist subagent, purpose trained to efficiently search your repo, was just released by Microsoft! Say hello to FastContext 😍

AgentSparko 💥 retweeted

ÆON FORGE ✨@SpaceTimeViking·4d

Receipts in video: see it float at ~100-150 while coding; the fluctuations were from the model's task and context switching. This thing rips through code! A single @NVIDIAAI DGX Spark ⚡️

ÆON FORGE ✨@SpaceTimeViking

Major stability update: the old image would collapse the DFlash acceptance rate quickly after use due to a vLLM bug. It would drop to as low as 20 Tok/s after initial usage. Resolved with patch pr41703. Now getting SUSTAINED coding generation speeds at ~150 Tok/s! Pull latest now!

AgentSparko 💥 retweeted

ÆON FORGE ✨@SpaceTimeViking·4d

ÆON FORGE ✨@SpaceTimeViking

So I've been validating my models with the latest version of my DGX Spark / Blackwell optimized vLLM container, and I'm floored by the benchmark results I just got with my Gemma 4 26B A4B model: 144 Tok/s on coding! Over 1700 Tok/s aggregate with 128 c! Get the latest container and recipe now! github.com/AEON-7/Gemma-4…

AgentSparko 💥 retweeted

Photographer@photo5065·3d

AgentSparko 💥 retweeted

Terp@OnlyTerp·2d

@DennisonBertram x.com/OnlyTerp/statu… like this one but this works for every model from every oauth 🫡

Terp@OnlyTerp

ULTRACODE-SHIM IS NOW LIVE 🔥 You can now run ANY model in UltraCode I built a github repo to make this really easy for you, Just send your agent there and let him COOK You deserve the flexibility to use LOCAL models & cost efficient models. So I made that happen for you 🫶

AgentSparko 💥 retweeted

Anthropic@AnthropicAI·3d

The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…

AgentSparko 💥 retweeted

Tech2Wild@Tech2Wild·4d

In the document here, MiniMax mentions a 109B MoE model and has open-sourced the sparse attention kernel behind it: 28.4x less compute at 1M context, 14.2x faster prefill, 7.6x faster decode, and it matches full attention on benchmarks. Is MiniMax 3 going to be even smaller?

RyanLee@RyanLeeMiniMax

Hey everyone, our high-performance MSA kernel library is now open-source. The M3 weights are expected to drop this Friday. Thanks for waiting! Github: github.com/MiniMax-AI/MSA Paper: github.com/MiniMax-AI/MSA…

AgentSparko 💥 retweeted

noname@malikwas1f·5d

Up to 1100 tps on 2x RTX 3090 for Diffusion Gemma 4 26B. Unleash this mini monster on your GPUs now! If you are running NVIDIA GPUs locally, come grab the recipe at club-3090: github.com/noonghunna/clu… P.S. A ⭐️ on GitHub is much appreciated. @googlegemma @vllm_project
