Ryan Shrout
@ryanshrout
27.3K posts
Ramblings of a technology guy. President and GM at Signal65. Formerly Graphics and AI product at Intel, former owner at PC Perspective.
Union, KY · Joined April 2007
1.9K Following · 19.1K Followers
Ryan Shrout reposted
Erik Kuna 🚀 @erikkuna
This is the shot you can’t get from the press site. This camera was sitting a few football fields from the SLS rocket at Pad 39B for days before launch, baking in the Florida sun, surviving rain, humidity, and whatever else the Cape threw at it. No photographer behind the viewfinder. Just a camera, a sound trigger, and a bet.

The way pad remotes work: you set your camera up days in advance, dial in your composition, lock everything down, and walk away. You don’t touch it again until after the launch. The shutter fires on sound activation with a @MiopsTrigger smart+ trigger. With SLS, the four RS-25 engines ignite six seconds before the solid rocket boosters, so the camera is already firing before the vehicle even leaves the pad. You get home, pull the card, and find out if you nailed it, or if a bird landed on your lens two days ago, left you a present, and you got 400 photos of something crappy.

There’s no formula for protecting your gear this close. Some photographers build wooden boxes with doors that pop open. Some use plastic bags and tape. Some do plastic or metal barn-door rigs on hinges. I tend to leave mine open in just plastic rain covers because boxes limit my composition and setup time, but that means your cameras are more exposed to the elements and whatever energy and debris comes off the pad. You’re basically gambling a camera body every time you set one.

That’s what I love about this genre. There’s no playbook. You make it up as you go. Every time is an adventure.

📸 credit: me for @SuperclusterHQ - Artemis II pad remote | ~1,000 ft from Pad 39B | Kennedy Space Center
[photo]
755 replies · 5.7K reposts · 47.5K likes · 1.2M views
Ryan Shrout @ryanshrout
We couldn’t be there in person, but I made sure the kiddos were watching last night. ⁦@NASAArtemis
[photo]
2 replies · 0 reposts · 11 likes · 619 views
Ryan Shrout reposted
delian @zebulgar
The coolest orbital animation I've seen of Artemis 2. It really shows you how far away they're flying today, and also how precise they need to be to go to the moon.
221 replies · 3K reposts · 24.5K likes · 1.7M views
Ryan Shrout @ryanshrout
Earlier today I posted about other @MLCommons MLPerf Inference v6.0 results. But MLPerf is not a one-vendor story. @AMD posted results that deserve serious attention.

Let's start with the competitive headline AMD put forward: single-node Llama 2 70B. The MI355X platform hit 97% of B200 Server performance, tied B200 in Offline, and beat it by 19% in Interactive. Against B300, it reached 93% Server and 104% Interactive. That is parity performance on the most watched LLM benchmark in MLPerf.

Gen-over-gen, MI355X delivered 3.1x more throughput than MI325X, reflecting the full CDNA 4 architecture, FP4/FP6 support, and 288GB of HBM3E. But it is worth noting that Llama 2 is pretty aged by AI model timelines...

Scale-out is where it gets interesting. AMD crossed 1 million tokens per second on GPT-OSS-120B at multinode scale, with 93%+ scaling efficiency. AMD is not at the absolute throughput levels NVIDIA demonstrated on DeepSeek-R1 with four NVL72 systems, but the efficiency numbers suggest the platform scales predictably, which is what matters for production planning.

Some honest context: AMD did not submit on DeepSeek-R1, the headline benchmark this round. The Wan 2.2 text-to-video result was Open category, not Closed. And NVIDIA still leads in absolute scale-out throughput and total benchmark coverage.

But the trajectory is strong. A year ago the AMD inference story was largely aspirational. Today it is backed by competitive MLPerf numbers across LLMs, MoE models, and multimodal workloads. With MI400 and Helios on the roadmap, inference competition is about to get a lot more interesting. This is exactly the kind of dynamic that is good for anyone building or buying AI infrastructure.
AMD @AMD
New workloads. New scale. New proof points. With AMD Instinct MI355X GPUs, we delivered breakthrough MLPerf Inference 6.0 results, including 1M+ tokens/sec at multi-node scale and first-time submissions on GPT-OSS-120B and Wan-2.2. Read the full story here: bit.ly/4do5HUP
2 replies · 12 reposts · 74 likes · 10.8K views
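The 93%+ scaling-efficiency claim above has a simple definition worth making explicit: measured multi-node throughput divided by what perfect linear scaling of a single node would deliver. A minimal sketch, using hypothetical per-node numbers for illustration (these are not AMD's actual submission details):

```python
def scaling_efficiency(multi_node_tps: float, nodes: int, single_node_tps: float) -> float:
    """Fraction of ideal linear scaling achieved: measured throughput
    over (node count x single-node throughput)."""
    return multi_node_tps / (nodes * single_node_tps)

# Hypothetical example: one node at 130k tok/s, eight nodes delivering 1.0M tok/s.
eff = scaling_efficiency(1_000_000, 8, 130_000)
print(f"{eff:.0%}")  # 96%
```

Efficiency in the low-to-mid 90s is the signal that interconnect and scheduling overheads stay small as nodes are added, which is why it matters more for capacity planning than any single peak number.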
Ryan Shrout @ryanshrout
Today is @MLCommons MLPerf Inference v6.0 day, and you are going to see a flood of results, press releases, and vendor claims hitting your feed. Let me help you cut through the noise, starting with the @nvidia perspective. Blog: developer.nvidia.com/blog/nvidia-ex…

The headline number that jumps off the page: 2.7x more throughput on DeepSeek-R1 from the same GB300 NVL72 hardware, compared to just six months ago. That is software optimization on existing infrastructure, and it translates to more than 60% reduction in per-token cost. For AI factory operators trying to pencil out the economics of inference at scale, that kind of improvement on deployed hardware is the story.

NVIDIA was the only platform to submit results across every new benchmark added this round. That includes DeepSeek-R1 Interactive (a new, much more demanding scenario with 5x faster minimum token rates), Qwen3-VL-235B (the first multimodal model in the MLPerf Inference suite), GPT-OSS-120B from OpenAI, WAN-2.2 text-to-video, and DLRMv3 for generative recommendation. It is easy to cherry-pick a single workload and optimize for it. Submitting across every category signals platform maturity.

The scale-out story is also worth noting. Four GB300 NVL72 systems interconnected with Quantum-X800 InfiniBand, totaling 288 Blackwell Ultra GPUs, pushed past 2.4 million tokens per second on DeepSeek-R1 offline. That is the largest scale ever submitted to any MLPerf Inference benchmark.

Under the hood, the software improvements powering these gains came from TensorRT-LLM and the open source Dynamo framework. Disaggregated serving, Wide Expert Parallel for sharding MoE experts across GPUs, multi-token prediction to keep compute utilized at smaller batch sizes, and KV-aware routing all contributed.

This is exactly the kind of data that reinforces what we focus on at @Signal_65. Inference economics, and specifically the cost to produce and serve tokens at scale, is becoming the defining metric for AI infrastructure decisions. MLPerf results like these give the industry a standardized, audited view, and we layer on top of that with independent testing that maps performance to actual deployment economics. More MLPerf v6.0 coverage coming today as other vendors publish their results.
0 replies · 4 reposts · 25 likes · 3K views
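The link between the 2.7x throughput gain and the "more than 60% reduction in per-token cost" above is straightforward arithmetic: on fixed hardware, cost per token is inversely proportional to throughput. A quick sketch of that relationship (the function name is mine, for illustration):

```python
def per_token_cost_reduction(throughput_speedup: float) -> float:
    """On the same hardware, per-token cost scales as 1/throughput,
    so an Nx speedup cuts cost by 1 - 1/N."""
    return 1 - 1 / throughput_speedup

# A 2.7x software speedup on unchanged infrastructure:
print(f"{per_token_cost_reduction(2.7):.0%}")  # 63%
```

1 - 1/2.7 lands at roughly 63%, consistent with the "more than 60%" framing, assuming the hardware's operating cost is unchanged.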
Ryan Shrout @ryanshrout
Today could be a historic one as @NASA is set to launch Artemis II around the moon. (T-8:30! youtube.com/watch?v=m3kR2K…)

NASA committed last week to building a permanent lunar base, with surface landings every six months and up to 30 robotic missions starting in 2027. Gateway is paused, nuclear propulsion is heading to Mars, and the whole thing just got a lot more ambitious. This is the kind of moment that makes you pay attention.

One part of the @AMD portfolio that does not get much attention is the space-grade compute business. AMD FPGAs already flew on Perseverance and OSIRIS-REx. Blue Origin is using AMD Versal AI Edge Gen 2 adaptive SoCs in its lunar lander flight computers. AMD published a blog this week connecting these dots to the NASA announcements.

When you are operating 238,000 miles from Earth, you cannot wait for a ground station to do your thinking. Sustained lunar presence demands autonomous, radiation-tolerant, AI-capable compute running THERE. That is exactly what the adaptive SoC portfolio delivers. The Xilinx acquisition continues to pay dividends in places most people are not looking. amd.com/en/blogs/2026/…
[YouTube video]
0 replies · 2 reposts · 5 likes · 888 views
Ryan Shrout reposted
Signal65 @Signal_65
Chromebooks have long been held back by older x86 silicon. Signal65 tested whether the @MediaTek Kompanio Ultra 910, built on TSMC 3nm with modern cores, changes that equation. Short answer: it does.

Key findings from our testing:
➡️ Kompanio Ultra 910 Chromebooks delivered up to 35% faster browser performance than the Intel Core Ultra 5 115U
➡️ 33% longer battery life during student workloads, and up to 74% longer in Google Meet
➡️ GPU performance was 2.2x to 8.8x faster than x86 competitors across 3DMark and GFXBench
➡️ Up to 53% faster content creation in Handbrake video encoding
➡️ Multitasking with Google Meet running widened the lead to 47% over Intel

The acoustic data here tells its own story. Under sustained load, the Kompanio Ultra 910 systems peaked at 23.4 dBA. The Intel system hit 32.4 dBA, and the AMD system reached 27.1 dBA. That is up to 28% quieter in our testing, a meaningful difference in a classroom or shared workspace.

Full report: signal65.com/research/chrom…
[photo]
0 replies · 2 reposts · 3 likes · 724 views
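The "up to 28% quieter" figure above falls out of comparing the peak meter readings directly. A small sketch of that comparison (note this is a relative reduction in the dBA reading, not a perceived-loudness ratio, since dBA is a logarithmic scale):

```python
def percent_quieter(louder_dba: float, quieter_dba: float) -> float:
    """Relative reduction in the measured sound-level reading
    (straight comparison of dBA values, as in the report's framing)."""
    return (louder_dba - quieter_dba) / louder_dba

# Intel system peak vs Kompanio Ultra 910 peak under sustained load:
print(f"{percent_quieter(32.4, 23.4):.0%}")  # 28%
```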
Ryan Shrout @ryanshrout
Great conversation with Ganesh here talking about the details of this @Signal_65 testing and report. Worth a watch/listen!
Six Five Media @TheSixFiveMedia
Most teams aren’t scaling infrastructure. They’re losing control of it. @Signal_65’s @ryanshrout + Russ Fellows talk with Ganesh Subramanian of @HPE about why fleet operations are replacing server management.

What’s breaking:
• Fragmented edge + hybrid environments
• Security drift across systems
• Manual ops at scale

What’s next: Policy-driven fleets. AI-assisted ops. Automation by default.

Watch the full webcast: youtu.be/N1YV6UEW7hg
0 replies · 0 reposts · 0 likes · 121 views
Ryan Shrout @ryanshrout
Had some great conversations in Austin visiting with the @DellTech team to talk about this new announcement of commercial and workstation devices.
Six Five Media @TheSixFiveMedia
AI is pushing compute power back to the workstation. At Dell’s CSI Lab in Austin, @RyanShrout sits down with @DellTech leaders Rob Bruckner, Charlie Walker, and Paul Doczy to unpack the new Dell Pro Precision portfolio. As AI development, simulation, and creative workloads scale, workstations are becoming a key bridge between local experimentation and datacenter infrastructure. Where do you see the workstation fitting in the AI stack?
1 reply · 0 reposts · 6 likes · 766 views
Patrick Moorhead @PatrickMoorhead
My patience is wearing thin, Claude.
[3 photos]
14 replies · 2 reposts · 125 likes · 116.6K views
Ryan Shrout @ryanshrout
With all the discussion around Claude suddenly changing usage limits on paying customers, I wonder: do I get usage BACK when I make a request, it works for a while, and then I get this error?
[photo]
0 replies · 0 reposts · 3 likes · 664 views
Ryan Shrout @ryanshrout
Which adds new questions: how do you make sure your files are easily available on the perma-machine, but also easily synced to a device while you're on the road? How can we get two "Coworks" on different devices to work together?
1 reply · 1 repost · 0 likes · 372 views
Ryan Shrout @ryanshrout
One of the more interesting, unexpected aspects of something like @claudeai Cowork is that it has me once again debating whether my primary computer should be an always-on desktop rather than a laptop. Being able to dispatch new work is only useful if I can do it anytime.
1 reply · 0 reposts · 0 likes · 707 views
Ryan Shrout reposted
Six Five Media @TheSixFiveMedia
We’re entering the era where AI success is measured in uptime, throughput, and cost, not model benchmarks. @RyanShrout is joined by Trish Damkroger at GTC to explore what it actually takes to move from experimentation to production. The companies pushing forward aren’t just testing, they’re building full-stack systems designed for sustained AI workloads. @HPE calls this “the AI factory”, an approach that brings infrastructure, software, and models together into something repeatable, scalable, and usable.
3 replies · 2 reposts · 39 likes · 199.6K views
Ryan Shrout @ryanshrout
From last night's red-eye from SFO to ATL.
[photo]
2 replies · 0 reposts · 8 likes · 799 views
Ryan Shrout @ryanshrout
The DGX Spark continues to be an impressive little device, punching above its weight class in key areas.
Signal65 @Signal_65
Can @Arm compete with x86 in a professional workstation? We put the @NVIDIA DGX Spark head-to-head against competing options (one Strix Halo and one Intel Core Ultra 7 265 + RTX 4000 Ada) to find out.

Some findings from our testing:
➡️ The GB10 Arm CPU led C-Ray rendering by 30-41% over both x86 SFF competitors across 1080p, 4K, and 5K resolutions
➡️ Prompt processing on LLMs was up to 3.2x faster than AMD Strix Halo
➡️ The 128GB unified memory pool runs GPT-OSS 120B and LLaMA 3.3 70B locally, workloads the Intel+Ada config simply cannot execute
➡️ Multi-user concurrency testing showed 3-7x faster time-to-first-token
➡️ All of this at up to 20% lower total system power under AI inference loads

Full report: signal65.com/research/ai/th…
0 replies · 2 reposts · 2 likes · 871 views
Ryan Shrout @ryanshrout
Is this really the future we all wanted?
[photo]
1 reply · 0 reposts · 5 likes · 571 views
Stacy Rasgon @Srasgon
When the internet keyboard geniuses suggest I need to learn about the pace of technology adoption and cost reduction in semiconductors...
[GIF]
8 replies · 1 repost · 59 likes · 7K views
Ryan Shrout @ryanshrout
@Arm CEO @renehaas237 when asked in a Q&A today about "if it's a good idea to enter the data center market with a chip, is it also a good idea for the client and edge markets?" Answer: "Could be a good idea. But this is all we're talking about today." 😂
0 replies · 0 reposts · 3 likes · 545 views