Matt Elliott

2.7K posts

@NetworkBrouhaha

TME at @AMD | CCIE #56011 | AI/ML/DevOps, Network Design/Architecture, Automation, Cloudy Stuff. Available to sit on your Board.

Lexington, KY · Joined January 2018
530 Following · 629 Followers
Matt Elliott @NetworkBrouhaha
@MacroEngineered I’ve been on board with this approach for a while :D github.com/shamsway/octant. Eventually I will get around to sharing all of the stuff I haven’t pushed to the public repo… There may be a branch with some fun stuff buried within. Definitely written a _lot_ of HCL with LLMs!
0
0
1
27
Matt Elliott retweeted
AI at AMD @AIatAMD
OpenClaw + open-source models + AMD GPUs = a very fun workshop. Shoutout to Eda Zhou and Mahdi Ghodsi for the deep dive (and the mini lobsters 🦞). Good vibes all around at @DeepLearningAI AI Dev '26!
2
6
27
1.3K
Matt Elliott @NetworkBrouhaha
@SharpNetwork The “weather predicting rock” is also a favorite. Agreed on the hot brown!
0
0
1
20
Eyvonne Sharp @SharpNetwork
At my husband’s favorite restaurant, Ramsey’s in Lexington, KY. Also, they make the best hot brown in the state.
1
0
5
180
Matt Elliott retweeted
Sharon Zhou @realSharonZhou
codegen is cheap now. performance isn’t: most generated kernels are kinda mid. iteration and feedback are missing for both the agent and RL env layers. so we're open-sourcing Apex: an end-to-end agent using Claude Code + Codex to effectively optimize AMD kernels, instead of one-shotting them github.com/AMD-AGI/Apex
7
21
190
90.7K
Sharon Zhou @realSharonZhou
Happy to share I’m expanding my role to report directly to @LisaSu!
57
23
721
109.8K
Matt Elliott retweeted
Sally Ward-Foxton @sallywf
As the person in charge of ROCm, @AMD's answer to CUDA, @AnushElangovan is singularly critical to the challenger taking any share from the market leader. Impressive, then, that he still answers every anonymous "ROCm sucks" tweet personally. Interview: eetimes.com/taking-on-cuda…
5
11
97
9.5K
Matt Elliott retweeted
SemiAnalysis @SemiAnalysis_
18x IMPROVEMENT ALERT🚀 In under 30 days, AMD was able to improve Kimi K2.5 1T MXFP4 interactivity by up to 18x at iso-throughput. The main changes are in PR number 35850: AMD fixed their vLLM AITER integration to support the Kimi K2.5 MLA, which uses num_head=8 for TP8 and num_head=16 for TP4, along with general GEMM tuning. All of these bug fixes and perf tuning are upstreamed and already in the vLLM 0.18 release. Great work by Chuan Li & @AnushElangovan. Speed is the Moat 🔥
3
36
324
40.6K
Matt Elliott retweeted
Anush Elangovan @AnushElangovan
End to End RL on AMD MI355x. Great partnership with @lmsysorg and @radixark
LMSYS Org@lmsysorg

🚀 New blog: ROCm Support for Miles: Large-Scale RL Post-Training on AMD Instinct™ GPUs
Together with @AMD, Miles brings end-to-end RL pipelines to MI300/350-class clusters:
⚡️ Rollout generation dominates RL compute, and AMD’s HBM bandwidth directly addresses this bottleneck
🧠 AIME accuracy improved from 0.665 → 0.729 across training on Qwen3-30B-A3B with GRPO
💾 MI300X delivers ~1.1–1.3k tok/GPU/s rollout throughput
⏱️ Mean step time 388.5s on a single 8-GPU MI300X node (32×8 sampling, 8k response cap)
🔧 Multi-turn agentic training validated
... and more optimizations to come 🔥

4
8
91
7.4K
Matt Elliott retweeted
Phoronix @phoronix
AMD Engineer Leverages AI To Help Make A Pure-Python AMD GPU User-Space Driver
Python user-space AMD GPU driver written in part by AI.... What?!?! phoronix.com/news/AI-Pure-P…
5
19
104
7.6K
Matt Elliott retweeted
SemiAnalysis @SemiAnalysis_
Due to optimizations in AMD's MoRI inference communication library and better kernels, AMD performance has improved 1.5x in the span of 30 days. These optimizations for MoE dispatch, MoE combine, and kvcache transfer comms have been upstreamed in SGLang for everyone to use in PRs 17012, 14626, 18437, etc. MoRI is AMD's inference comms library, built from first principles by their 10x China team! In the age of inference, developer velocity of inference optimizations matters a lot, and InferenceX™ is our research platform to continuously track the performance.
9
15
176
15.7K
Matt Elliott retweeted
Anush Elangovan @AnushElangovan
Inspired by @__tinygrad__ userspace AMD driver, I clauded a userspace driver for some stress testing of SDMA and compute/comms overlap debug. I didn't open the editor once. Agents are the great equalizer in software. And Speed is the moat. github.com/ROCm/TheRock/t…
10
15
203
12.2K
Matt Elliott retweeted
the tiny corp @__tinygrad__
AMD open sourced rocprof-trace-decoder! This was one of the last pieces of closed source code on the CPU side -- the definitions of the hardware SQTT traces are now public. AMD's tracing infrastructure is better than NVIDIA's, it can trace the timing of every instruction.
12
52
1.2K
52.1K
Matt Elliott retweeted
Meta Newsroom @MetaNewsroom
Today, we’re announcing a long-term agreement with @AMD to power our AI infrastructure with up to 6GW of AMD Instinct GPUs, helping us build a flexible, resilient tech stack for our AI workloads. about.fb.com/news/2026/02/m…
26
49
281
10.2K
Matt Elliott retweeted
Eyvonne Sharp @SharpNetwork
Hard day. So I set my table with my yard sale Christmas dishes I got for $35.
3
0
14
438