UW SyFi

27 posts

@UWSyFi

The Systems for Future Intelligence lab at @uwcse.

Seattle, Washington · Joined February 2026
13 Following · 69 Followers
UW SyFi @UWSyFi
🛠️ Our release includes trace examples, the collector/sanitizer, the full anonymized trace with analysis scripts and a chatbot, and a replay client for serving engines such as vLLM and SGLang.
🔍 This trace is an early look at real coding-agent traffic: self-driven loops, long-context short-output rounds, long-tailed tool execution, and imperfect prefix caching. It is also biased toward our own projects and habits, which is why we are releasing the full pipeline.
🤝 If you use Claude Code, Codex, or another coding agent, try TraceLab on your own logs, share a sanitized trace if you are comfortable, and help turn this first data point into a shared community resource. (7/n)
UW SyFi @UWSyFi
🔥 Coding agents have become one of the hottest LLM workloads. But serving them looks nothing like serving a chatbot: 294× more input than output, hundreds of thousands of tool calls, and extremely long-tailed latency.
🚀 We are releasing the SyFI Coding Trace: ~4,300 real-world coding-agent sessions from our daily use, plus TraceLab, an open-source pipeline to collect, sanitize, analyze, and replay your own traces.
More in the thread below 🧵👇 (1/n)
UW SyFi reposted
Stephanie Wang @thepadawang
M* is a new system for multimodal inference from our lab @uwsyfi. The system captures multimodal models as dataflow graphs, implements a generic engine for those graphs, and achieves SOTA inference throughput and latency. Learn more here! m-star.org
Keisuke Kamahori @KeisukeKamahori

New multimodal model architectures shouldn't require new serving systems. Introducing our work, M* (M-Star): a universal serving system for multimodal models that separates what a model computes - a dataflow graph - from how it runs: placement, scheduling, batching, and transport. Joint work across @uwcse, @StanfordAILab, and @CMU_ECE with Atindra Jha, Naomi Sagan, Irmak Sivgin, Rohan Sanda, @ste_veng, Mark Horowitz, @LukeZettlemoyer, Olivia Hsu, @jure, @bariskasikci, and @thepadawang.

UW SyFi reposted
Keisuke Kamahori @KeisukeKamahori
Excited to share that I’ll be interning at @nvidia this summer at the Santa Clara HQ, working on GPU architecture! If you’re in the Bay Area, I’d love to grab coffee! Always happy to chat about agents, ML systems, GPU architecture, or anything in between. #NVIDIALife
UW SyFi reposted
Keisuke Kamahori @KeisukeKamahori
New multimodal model architectures shouldn't require new serving systems. Introducing our work, M* (M-Star): a universal serving system for multimodal models that separates what a model computes - a dataflow graph - from how it runs: placement, scheduling, batching, and transport. Joint work across @uwcse, @StanfordAILab, and @CMU_ECE with Atindra Jha, Naomi Sagan, Irmak Sivgin, Rohan Sanda, @ste_veng, Mark Horowitz, @LukeZettlemoyer, Olivia Hsu, @jure, @bariskasikci, and @thepadawang.
UW SyFi @UWSyFi
Unlike current frameworks, Piper correctly composes pipeline parallelism with ZeRO-2 and ZeRO-3 memory optimizations. In our experiments on Qwen3 9B, Piper encodes correct sharding semantics and supports larger batch sizes where Megatron, DeepSpeed, and TorchTitan fall short.
UW SyFi @UWSyFi
New distributed training strategies should not require new distributed runtimes. Introducing Piper: a programmable PyTorch training system for deploying complex training strategies by separating model placement and GPU scheduling from model code. 📄 arxiv.org/abs/2606.11169
UW SyFi reposted
Mathew Jacob @mat_jacob1002
If there’s one thing you should do to learn how to build performant systems in this AI era, it is following @KeisukeKamahori @sudopowr and @MichaelGu341332!
Baris Kasikci @bariskasikci

Super stoked that UW SyFI (syfi.cs.washington.edu) members won a number of prizes at the MLSys'26 competition, NVIDIA Track. Huge congrats to @KeisukeKamahori, @sudopowr, Yile Gu, Wei Shen, Steven Gao! Thanks to @nvidia, @modal, and the FlashInfer team for the support.
1st place in the GDN Track: Full-Agent Approach
2nd place in the GDN Track: Agent-Assisted Approach
3rd place in the DSA Track: Full-Agent Approach

UW SyFi reposted
Baris Kasikci @bariskasikci
Super stoked that UW SyFI (syfi.cs.washington.edu) members won a number of prizes at the MLSys'26 competition, NVIDIA Track. Huge congrats to @KeisukeKamahori, @sudopowr, Yile Gu, Wei Shen, Steven Gao! Thanks to @nvidia, @modal, and the FlashInfer team for the support.
1st place in the GDN Track: Full-Agent Approach
2nd place in the GDN Track: Agent-Assisted Approach
3rd place in the DSA Track: Full-Agent Approach
UW SyFi reposted
Keisuke Kamahori @KeisukeKamahori
Very excited to share that our team at @UWSyFi won multiple prizes at the FlashInfer AI Kernel Generation Contest in #MLSys2026! Huge thanks for organizing an amazing contest @ye_combinator @yi_xin_dong @charles_irl
Baris Kasikci @bariskasikci

Super stoked that UW SyFI (syfi.cs.washington.edu) members won a number of prizes at the MLSys'26 competition, NVIDIA Track. Huge congrats to @KeisukeKamahori, @sudopowr, Yile Gu, Wei Shen, Steven Gao! Thanks to @nvidia, @modal, and the FlashInfer team for the support.
1st place in the GDN Track: Full-Agent Approach
2nd place in the GDN Track: Agent-Assisted Approach
3rd place in the DSA Track: Full-Agent Approach
