Matt Elliott

2.7K posts

@NetworkBrouhaha

TME at @AMD | CCIE #56011 | AI/ML/DevOps, Network Design/Architecture, Automation, Cloudy Stuff. Available to sit on your Board.

Lexington, KY · Joined January 2018

530 Following · 629 Followers

Matt Elliott@NetworkBrouhaha·4d

@MacroEngineered I’ve been on board with this approach for a while :D github.com/shamsway/octant. Eventually I will get around to sharing all of the stuff I haven’t pushed to the public repo… There may be a branch with some fun stuff buried within. Definitely written a _lot_ of HCL with LLMs!

William Collins@MacroEngineered·4d

wcollins.io/posts/2026/ai-…

Matt Elliott retweeted

AI at AMD@AIatAMD·29 Apr

OpenClaw + open-source models + AMD GPUs = a very fun workshop Shoutout to Eda Zhou and Mahdi Ghodsi for the deep dive (and the mini lobsters 🦞) Good vibes all around at @DeepLearningAI AI Dev '26!

1.3K

Matt Elliott@NetworkBrouhaha·27 Apr

@SharpNetwork The “weather predicting rock” is also a favorite. Agreed on the hot brown!

Eyvonne Sharp@SharpNetwork·25 Apr

At my husband’s favorite restaurant, Ramsey’s in Lexington, KY. Also, they make the best hot brown in the state.

180

Matt Elliott@NetworkBrouhaha·15 Apr

@SharpNetwork

GIF

Eyvonne Sharp@SharpNetwork·15 Apr

👀

Isi Breen@isaiah_bb

American men are more likely to have gambling debt than to have read a book in the last year.

153

Matt Elliott retweeted

Phil Gervasi@network_phil·15 Apr

[NEW BLOG] Rail-Optimized Networking for AI Training Workloads networkphil.com/2026/04/15/rai…

414

Matt Elliott retweeted

Sharon Zhou@realSharonZhou·23 Mar

codegen is cheap now. performance isn’t: most generated kernels are kinda mid. iteration and feedback are missing for both the agent and RL env layers. so we're open-sourcing Apex: an end-to-end agent using Claude Code + Codex to effectively optimize AMD kernels, instead of one-shotting them github.com/AMD-AGI/Apex

190

90.7K

Matt Elliott@NetworkBrouhaha·3 Apr

@realSharonZhou @LisaSu

GIF

Sharon Zhou@realSharonZhou·2 Apr

Happy to share I’m expanding my role to report directly to @LisaSu!

721

109.8K

Matt Elliott retweeted

Emad Barsoum@EmadBarsoumPi·3 Apr

Gemma 4 Day 0 support!!! @AIatAMD amd.com/en/developer/r…

5.2K

Matt Elliott retweeted

Sally Ward-Foxton@sallywf·1 Apr

As the person in charge of ROCm, @AMD's answer to CUDA, @AnushElangovan is singularly critical to the challenger taking any share from the market leader. Impressive, then, that he still answers every anonymous "ROCm sucks" tweet personally. Interview: eetimes.com/taking-on-cuda…

9.5K

Matt Elliott retweeted

SemiAnalysis@SemiAnalysis_·27 Mar

18x IMPROVEMENT ALERT🚀 In under 30 days, AMD was able to improve Kimi K2.5 1T MXFP4 interactivity by up to 18x when iso-throughput. The main changes are in PR number 35850. AMD fixed their vLLM AITER integration to support the Kimi K2.5 MLA which uses num_head=8 for TP8 & num_head=16 for TP4 along with general GEMM tuning. All of these bug fixes & perf tuning are upstreamed & already in the vLLM 0.18 release. Great work to Chuan Li & @AnushElangovan Speed is the Moat 🔥

324

40.6K

Matt Elliott retweeted

Anush Elangovan@AnushElangovan·24 Mar

End to End RL on AMD MI355x. Great partnership with @lmsysorg and @radixark

LMSYS Org@lmsysorg

🚀 New blog: ROCm Support for Miles: Large-Scale RL Post-Training on AMD Instinct™ GPUs Together with @AMD, Miles brings end-to-end RL pipelines to MI300/350-class clusters: ⚡️ Rollout generation dominates RL compute, and AMD’s HBM bandwidth directly addresses this bottleneck 🧠 AIME accuracy improved from 0.665 → 0.729 across training on Qwen3-30B-A3B with GRPO 💾 MI300X delivers ~1.1–1.3k tok/GPU/s rollout throughput ⏱️ Mean step time 388.5s on a single 8-GPU MI300X node (32×8 sampling, 8k response cap) 🔧 Multi-turn agentic training validated ... and more optimizations to come 🔥

7.4K

Matt Elliott retweeted

Phoronix@phoronix·4 Mar

AMD Engineer Leverages AI To Help Make A Pure-Python AMD GPU User-Space Driver Python user-space AMD GPU driver written in part by AI.... What?!?! phoronix.com/news/AI-Pure-P…

104

7.6K

Matt Elliott retweeted

SemiAnalysis@SemiAnalysis_·4 Mar

Due to optimizations in AMD's MoRI inference communication library & better kernels AMD performance has improved 1.5x in the span of 30 days. These optimizations for MoE dispatch, MoE combine, kvcache transfer comms have been upstreamed in SGLang for everyone to use in PR 17012, 14626, 18437 etc. MoRI is AMD's inference comms library built from first principles by their 10x china team! In the age of inference, developer velocity of inference optimizations matters a lot & InferenceX™ is our research platform to continuously track the performance.

176

15.7K

Matt Elliott retweeted

Anush Elangovan@AnushElangovan·4 Mar

Inspired by @__tinygrad__ userspace AMD driver, I clauded a userspace driver for some stress testing of SDMA and compute/comms overlap debug. I didn't open the editor once. Agents are the great equalizer in software. And Speed is the moat. github.com/ROCm/TheRock/t…

203

12.2K

Matt Elliott retweeted

the tiny corp@__tinygrad__·3 Mar

And now it's documented. For people serious about performance, this is a real AMD advantage. Thanks @AnushElangovan for pushing this through! Code is here: github.com/ROCm/rocm-syst…

365

33.9K

Matt Elliott retweeted

the tiny corp@__tinygrad__·3 Mar

AMD open sourced rocprof-trace-decoder! This was one of the last pieces of closed source code on the CPU side -- the definitions of the hardware SQTT traces are now public. AMD's tracing infrastructure is better than NVIDIA's, it can trace the timing of every instruction.

1.2K

52.1K

Matt Elliott retweeted

Anush Elangovan@AnushElangovan·27 Feb

Beyond Porting: How vLLM Orchestrates High-Performance Inference on AMD ROCm blog.vllm.ai/2026/02/27/roc…

3.6K

Matt Elliott retweeted

Meta Newsroom@MetaNewsroom·24 Feb

Today, we’re announcing a long-term agreement with @AMD to power our AI infrastructure with up to 6GW of AMD Instinct GPUs, helping us build a flexible, resilient tech stack for our AI workloads. about.fb.com/news/2026/02/m…

281

10.2K

Matt Elliott retweeted

Anush Elangovan@AnushElangovan·21 Feb

Open Claw on your Strix Halo. Great work.

Maximilian Messing@mmessing

My AI agent drafted this post while I slept. OpenClaw. Running locally on my Mini PC. Woke up at 7:30 AM to: → 5 curated news stories (filtered to exactly what matters for my work) → A LinkedIn post draft ready to review → A suggested workout based on my last session → 3 open source model updates worth checking All delivered as a PDF. Before my first coffee. The setup: → AMD Strix Halo Mini PC → €0/month cloud costs → One agent running locally while I sleep This is what I mean when I say personal AI computing is the shift. Not ChatGPT on demand. An agent that knows your priorities, watches your world, and shows up with work done. What's your morning routine missing?

6.3K

Matt Elliott@NetworkBrouhaha·16 Dec

@SharpNetwork Well that’s lovely. Merry Christmas to the Sharp fam!

Eyvonne Sharp@SharpNetwork·16 Dec

Hard day. So I set my table with my yard sale Christmas dishes I got for $35.

438
