Ryan Shrout
@ryanshrout
27.3K posts
Ramblings of a technology guy. President and GM at Signal65. Formerly Graphics and AI product at Intel, former owner at PC Perspective.
Union, KY · Joined April 2007
1.9K Following · 19.1K Followers
Ryan Shrout reposted
Erik Kuna 🚀 @erikkuna
This is the shot you can’t get from the press site. This camera was sitting a few football fields from the SLS rocket at Pad 39B for days before launch, baking in the Florida sun, surviving rain, humidity, and whatever else the Cape threw at it. No photographer behind the viewfinder. Just a camera, a sound trigger, and a bet.

The way pad remotes work: you set your camera up days in advance, dial in your composition, lock everything down, and walk away. You don’t touch it again until after the launch. The shutter fires on sound activation with a @MiopsTrigger smart+ trigger. With SLS, the four RS-25 engines ignite six seconds before the solid rocket boosters, so the camera is already firing before the vehicle even leaves the pad. You get home, pull the card, and find out if you nailed it, or if a bird landed on your lens two days ago, left you a present, and you got 400 photos of something crappy.

There’s no formula for protecting your gear this close. Some photographers build wooden boxes with doors that pop open. Some use plastic bags and tape. Some do plastic or metal barn-door rigs on hinges. I tend to leave mine open in just plastic rain covers because boxes limit my composition and setup time, but that means your cameras are more exposed to the elements and whatever energy and debris comes off the pad. You’re basically gambling a camera body every time you set one.

That’s what I love about this genre. There’s no playbook. You make it up as you go. Every time is an adventure.

📸 credit: me for @SuperclusterHQ - Artemis II pad remote | ~1,000 ft from Pad 39B | Kennedy Space Center
[photo]
755 replies · 5.7K reposts · 47.5K likes · 1.2M views
Ryan Shrout @ryanshrout
We couldn’t be there in person, but I made sure the kiddos were watching last night. ⁦@NASAArtemis
[photo]
2 replies · 0 reposts · 11 likes · 619 views
Ryan Shrout reposted
delian @zebulgar
The coolest orbital animation I've seen of Artemis 2. It really shows you how far away they're flying today, and also how precise they need to be to go to the moon.
221 replies · 3K reposts · 24.5K likes · 1.7M views
Ryan Shrout @ryanshrout
Earlier today I posted about other @MLCommons MLPerf Inference v6.0 results. But MLPerf is not a one-vendor story. @AMD posted results that deserve serious attention.

Let's start with the competitive headline AMD put forward: single-node Llama 2 70B. The MI355X platform hit 97% of B200 Server performance, tied B200 in Offline, and beat it by 19% in Interactive. Against B300, it reached 93% Server and 104% Interactive. That is parity performance on the most watched LLM benchmark in MLPerf.

Gen-over-gen, MI355X delivered 3.1x more throughput than MI325X, reflecting the full CDNA 4 architecture, FP4/FP6 support, and 288GB of HBM3E. But it is worth noting that Llama 2 is pretty aged by AI model timelines...

Scale-out is where it gets interesting. AMD crossed 1 million tokens per second on GPT-OSS-120B at multinode scale, with 93%+ scaling efficiency. AMD is not at the absolute throughput levels NVIDIA demonstrated on DeepSeek-R1 with four NVL72 systems, but the efficiency numbers suggest the platform scales predictably, which is what matters for production planning.

Some honest context: AMD did not submit on DeepSeek-R1, the headline benchmark this round. The Wan 2.2 text-to-video result was Open category, not Closed. And NVIDIA still leads in absolute scale-out throughput and total benchmark coverage.

But the trajectory is strong. A year ago the AMD inference story was largely aspirational. Today it is backed by competitive MLPerf numbers across LLMs, MoE models, and multimodal workloads. With MI400 and Helios on the roadmap, inference competition is about to get a lot more interesting. This is exactly the kind of dynamic that is good for anyone building or buying AI infrastructure.
AMD @AMD
New workloads. New scale. New proof points. With AMD Instinct MI355X GPUs, we delivered breakthrough MLPerf Inference 6.0 results, including 1M+ tokens/sec at multi-node scale and first-time submissions on GPT-OSS-120B and Wan-2.2. Read the full story here: bit.ly/4do5HUP
2 replies · 12 reposts · 74 likes · 10.8K views
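The 93%+ scaling-efficiency claim above has a simple definition worth making explicit: measured multi-node throughput divided by what perfect linear scaling of a single node would deliver. A minimal sketch, using hypothetical per-node numbers for illustration (these are not AMD's actual submission details):

```python
def scaling_efficiency(multi_node_tps: float, nodes: int, single_node_tps: float) -> float:
    """Fraction of ideal linear scaling achieved: measured throughput
    over (node count x single-node throughput)."""
    return multi_node_tps / (nodes * single_node_tps)

# Hypothetical example: one node at 130k tok/s, eight nodes delivering 1.0M tok/s.
eff = scaling_efficiency(1_000_000, 8, 130_000)
print(f"{eff:.0%}")  # 96%
```

Efficiency in the low-to-mid 90s is the signal that interconnect and scheduling overheads stay small as nodes are added, which is why it matters more for capacity planning than any single peak number.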
Ryan Shrout @ryanshrout
Today is @MLCommons MLPerf Inference v6.0 day, and you are going to see a flood of results, press releases, and vendor claims hitting your feed. Let me help you cut through the noise, starting with the @nvidia perspective. Blog: developer.nvidia.com/blog/nvidia-ex…

The headline number that jumps off the page: 2.7x more throughput on DeepSeek-R1 from the same GB300 NVL72 hardware, compared to just six months ago. That is software optimization on existing infrastructure, and it translates to more than 60% reduction in per-token cost. For AI factory operators trying to pencil out the economics of inference at scale, that kind of improvement on deployed hardware is the story.

NVIDIA was the only platform to submit results across every new benchmark added this round. That includes DeepSeek-R1 Interactive (a new, much more demanding scenario with 5x faster minimum token rates), Qwen3-VL-235B (the first multimodal model in the MLPerf Inference suite), GPT-OSS-120B from OpenAI, WAN-2.2 text-to-video, and DLRMv3 for generative recommendation. It is easy to cherry-pick a single workload and optimize for it. Submitting across every category signals platform maturity.

The scale-out story is also worth noting. Four GB300 NVL72 systems interconnected with Quantum-X800 InfiniBand, totaling 288 Blackwell Ultra GPUs, pushed past 2.4 million tokens per second on DeepSeek-R1 offline. That is the largest scale ever submitted to any MLPerf Inference benchmark.

Under the hood, the software improvements powering these gains came from TensorRT-LLM and the open source Dynamo framework. Disaggregated serving, Wide Expert Parallel for sharding MoE experts across GPUs, multi-token prediction to keep compute utilized at smaller batch sizes, and KV-aware routing all contributed.

This is exactly the kind of data that reinforces what we focus on at @Signal_65. Inference economics, and specifically the cost to produce and serve tokens at scale, is becoming the defining metric for AI infrastructure decisions. MLPerf results like these give the industry a standardized, audited view, and we layer on top of that with independent testing that maps performance to actual deployment economics. More MLPerf v6.0 coverage coming today as other vendors publish their results.
0 replies · 4 reposts · 25 likes · 3K views
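The link between the 2.7x throughput gain and the "more than 60% reduction in per-token cost" above is straightforward arithmetic: on fixed hardware, cost per token is inversely proportional to throughput. A quick sketch of that relationship (the function name is mine, for illustration):

```python
def per_token_cost_reduction(throughput_speedup: float) -> float:
    """On the same hardware, per-token cost scales as 1/throughput,
    so an Nx speedup cuts cost by 1 - 1/N."""
    return 1 - 1 / throughput_speedup

# A 2.7x software speedup on unchanged infrastructure:
print(f"{per_token_cost_reduction(2.7):.0%}")  # 63%
```

1 - 1/2.7 lands at roughly 63%, consistent with the "more than 60%" framing, assuming the hardware's operating cost is unchanged.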
Ryan Shrout @ryanshrout
Today could be a historic one as @NASA is set to launch Artemis II around the moon. (T-8:30! youtube.com/watch?v=m3kR2K…)

NASA committed last week to building a permanent lunar base, with surface landings every six months and up to 30 robotic missions starting in 2027. Gateway is paused, nuclear propulsion is heading to Mars, and the whole thing just got a lot more ambitious. This is the kind of moment that makes you pay attention.

One part of the @AMD portfolio that does not get much attention is the space-grade compute business. AMD FPGAs already flew on Perseverance and OSIRIS-REx. Blue Origin is using AMD Versal AI Edge Gen 2 adaptive SoCs in its lunar lander flight computers. AMD published a blog this week connecting these dots to the NASA announcements.

When you are operating 238,000 miles from Earth, you cannot wait for a ground station to do your thinking. Sustained lunar presence demands autonomous, radiation-tolerant, AI-capable compute running THERE. That is exactly what the adaptive SoC portfolio delivers. The Xilinx acquisition continues to pay dividends in places most people are not looking. amd.com/en/blogs/2026/…
[YouTube video]
0 replies · 2 reposts · 5 likes · 888 views
Ryan Shrout reposted
Signal65 @Signal_65
Chromebooks have long been held back by older x86 silicon. Signal65 tested whether the @MediaTek Kompanio Ultra 910, built on TSMC 3nm with modern cores, changes that equation. Short answer: it does.

Key findings from our testing:
➡️ Kompanio Ultra 910 Chromebooks delivered up to 35% faster browser performance than the Intel Core Ultra 5 115U
➡️ 33% longer battery life during student workloads, and up to 74% longer in Google Meet
➡️ GPU performance was 2.2x to 8.8x faster than x86 competitors across 3DMark and GFXBench
➡️ Up to 53% faster content creation in Handbrake video encoding
➡️ Multitasking with Google Meet running widened the lead to 47% over Intel

The acoustic data here tells its own story. Under sustained load, the Kompanio Ultra 910 systems peaked at 23.4 dBA. The Intel system hit 32.4 dBA, and the AMD system reached 27.1 dBA. That is up to 28% quieter in our testing, a meaningful difference in a classroom or shared workspace.

Full report: signal65.com/research/chrom…
[photo]
0 replies · 2 reposts · 3 likes · 724 views
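The "up to 28% quieter" figure above falls out of comparing the peak meter readings directly. A small sketch of that comparison (note this is a relative reduction in the dBA reading, not a perceived-loudness ratio, since dBA is a logarithmic scale):

```python
def percent_quieter(louder_dba: float, quieter_dba: float) -> float:
    """Relative reduction in the measured sound-level reading
    (straight comparison of dBA values, as in the report's framing)."""
    return (louder_dba - quieter_dba) / louder_dba

# Intel system peak vs Kompanio Ultra 910 peak under sustained load:
print(f"{percent_quieter(32.4, 23.4):.0%}")  # 28%
```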
Ryan Shrout @ryanshrout
Great conversation with Ganesh here talking about the details of this @Signal_65 testing and report. Worth a watch/listen!
Six Five Media @TheSixFiveMedia
Most teams aren’t scaling infrastructure. They’re losing control of it. @Signal_65’s @ryanshrout + Russ Fellows talk with Ganesh Subramanian of @HPE about why fleet operations are replacing server management.

What’s breaking:
• Fragmented edge + hybrid environments
• Security drift across systems
• Manual ops at scale

What’s next: Policy-driven fleets. AI-assisted ops. Automation by default.

Watch the full webcast: youtu.be/N1YV6UEW7hg
0 replies · 0 reposts · 0 likes · 121 views
Ryan Shrout @ryanshrout
Had some great conversations in Austin visiting with the @DellTech team to talk about this new announcement of commercial and workstation devices.
Six Five Media @TheSixFiveMedia
AI is pushing compute power back to the workstation. At Dell’s CSI Lab in Austin, @RyanShrout sits down with @DellTech leaders Rob Bruckner, Charlie Walker, and Paul Doczy to unpack the new Dell Pro Precision portfolio. As AI development, simulation, and creative workloads scale, workstations are becoming a key bridge between local experimentation and datacenter infrastructure. Where do you see the workstation fitting in the AI stack?
1 reply · 0 reposts · 6 likes · 766 views
Patrick Moorhead @PatrickMoorhead
My patience is wearing thin, Claude.
[3 photos]
14 replies · 2 reposts · 125 likes · 116.6K views
Ryan Shrout @ryanshrout
With all the discussion around Claude suddenly changing usage limits on paying customers, I wonder: do I get usage BACK when I make a request, it works for a while, and then I get this error?
[photo]
0 replies · 0 reposts · 3 likes · 664 views
Ryan Shrout @ryanshrout
Which adds new questions: how do you make sure your files are easily available on the perma-machine, but also easily synced to a device while you're on the road? How can we get two "Coworks" on different devices to work together?
1 reply · 1 repost · 0 likes · 372 views
Ryan Shrout @ryanshrout
One of the more interesting, unexpected aspects of something like @claudeai Cowork is that it has me once again debating whether my primary computer should be an always-on desktop rather than a laptop. Being able to dispatch new work is only useful if I can do it anytime.
1 reply · 0 reposts · 0 likes · 707 views
Ryan Shrout reposted
Six Five Media @TheSixFiveMedia
We’re entering the era where AI success is measured in uptime, throughput, and cost, not model benchmarks. @RyanShrout is joined by Trish Damkroger at GTC to explore what it actually takes to move from experimentation to production. The companies pushing forward aren’t just testing, they’re building full-stack systems designed for sustained AI workloads. @HPE calls this “the AI factory”, an approach that brings infrastructure, software, and models together into something repeatable, scalable, and usable.
3 replies · 2 reposts · 39 likes · 199.6K views
Ryan Shrout @ryanshrout
From last night's red-eye from SFO to ATL.
[photo]
2 replies · 0 reposts · 8 likes · 799 views
Ryan Shrout @ryanshrout
The DGX Spark continues to be an impressive little device, punching above its weight class in key areas.
Signal65 @Signal_65
Can @Arm compete with x86 in a professional workstation? We put the @NVIDIA DGX Spark head-to-head against competing options (one Strix Halo and one Intel Core Ultra 7 265 + RTX 4000 Ada) to find out.

Some findings from our testing:
➡️ The GB10 Arm CPU led C-Ray rendering by 30-41% over both x86 SFF competitors across 1080p, 4K, and 5K resolutions
➡️ Prompt processing on LLMs was up to 3.2x faster than AMD Strix Halo
➡️ The 128GB unified memory pool runs GPT-OSS 120B and LLaMA 3.3 70B locally, workloads the Intel+Ada config simply cannot execute
➡️ Multi-user concurrency testing showed 3-7x faster time-to-first-token
➡️ All of this at up to 20% lower total system power under AI inference loads

Full report: signal65.com/research/ai/th…
0 replies · 2 reposts · 2 likes · 871 views
Ryan Shrout @ryanshrout
Is this really the future we all wanted?
[photo]
1 reply · 0 reposts · 5 likes · 571 views
Stacy Rasgon @Srasgon
When the internet keyboard geniuses suggest I need to learn about the pace of technology adoption and cost reduction in semiconductors...
[GIF]
8 replies · 1 repost · 59 likes · 7K views
Ryan Shrout @ryanshrout
@Arm CEO @renehaas237 when asked in a Q&A today about "if it's a good idea to enter the data center market with a chip, is it also a good idea for the client and edge markets?" Answer: "Could be a good idea. But this is all we're talking about today." 😂
0 replies · 0 reposts · 3 likes · 545 views