Yi Pan

45 posts

@conlesspan

PhD Student @BerkeleySky on systems and AI; Prev @UWSyFI @sjtu1896

Joined August 2019
291 Following · 80 Followers
Yi Pan retweeted
Guilherme Favaron
Every team shipping a coding agent — Claude Code, Codex, Cursor — is really running a serving-systems problem. The "tech behind the tech" is the LLM-serving stack underneath, and until now nobody had real data on what that workload looks like. New arXiv (2606.30560) from @bariskasikci's SyFI lab (@UWSyFi, @uwcse) is the first large cross-provider trace of real coding-agent use: ~4,300 sessions, 350K LLM steps, 430K tool calls, 43 developers, 8 months, Claude Code + Codex. It breaks the intuition that agents mean long generations. The median step replays ~119K context tokens to emit just ~214 output tokens — two orders of magnitude more reading than writing. So the bill is the context, not the answer: prefix tokens are 59.5% of total cost. Tool calls are brutally long-tailed: 80+ tools, but the top 3 are 80%+ of calls, and the 4% of calls that run >1 min eat 85% of all tool time. And the prefix cache everyone leans on? 95.7% hit rate — yet misses cluster right after a human pauses to think, amplifying prefill 3.8x. Those human-gap misses alone are ~46% of fresh tokens and ~13% of spend. For technical leaders: your agent's cost and latency live in the loop, the replayed context, and the idle gaps — not raw token generation. Tune tool-call overhead, append-length-aware prefill, and KV-cache eviction around human gaps before you scale the fleet.
English
3
3
9
803
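As a quick check of the "two orders of magnitude" claim in the post above, the ratio of replayed context to emitted output can be computed from the two medians it quotes. This is only a sketch: the variable names are illustrative, and the figures are the rounded values from the post, not numbers taken from the paper itself.

```python
import math

# Rounded medians quoted in the post (assumed here to be in tokens per step).
context_tokens = 119_000  # context replayed per step
output_tokens = 214       # output emitted per step

# How many tokens are read for every token written.
ratio = context_tokens / output_tokens

# The same ratio expressed as orders of magnitude (base-10 logarithm).
orders_of_magnitude = math.log10(ratio)

print(f"read/write ratio: {ratio:.0f}x")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

The ratio lands between 100x and 1000x, which is consistent with the post's wording.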
Yi Pan retweeted
Xiangfeng Zhu (@xiangfeng_zhu)
I feel incredibly fortunate to have had two advisors whose unwavering support shaped my PhD journey. They gave me the freedom to explore, take risks, and occasionally disappear down research rabbit holes. Thank you, @ratulm and @arvind_uw! Now, on to the next phase :)
Ratul Mahajan (@ratulm)

Rituals are silly, but fun too. Here are @arvind_uw and I hooding our PhD student, @xiangfeng_zhu. His thesis showed how to design and implement networks that are hyper-customized to applications' needs rather than requiring applications to work around whatever the network stack happens to provide. He is now off to help machines think at Thinking Machines. Good luck, Xiangfeng!

English
3
1
29
30.5K
Yi Pan (@conlesspan)
@cHHillee This is related to github.com/google/perfett… (#issuecomment-3723524517). My current workaround is a script that manually adjusts the timestamps so that no events overlap 🫠
English
1
0
2
178
Horace He (@cHHillee)
I'm not sure how useful this is but it certainly would have been useful for me last year... If a chrome trace has events that overlap on the same stream (e.g. event1 ends at 10.1 and event2 begins at 10.05), perfetto's behavior is to not show the events and show an empty gap in your trace >:(
English
24
5
160
20.5K
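The workaround mentioned in this thread (a script that adjusts timestamps so that events on one stream no longer overlap) could look roughly like the sketch below. It is not the script from the thread. The function name `trim_overlaps` is hypothetical, and the choice to shorten the earlier event rather than delay the later one is an assumption. It only handles complete events (`"ph": "X"`) that carry `ts` and `dur` fields, grouped by `pid` and `tid`, and it leaves properly nested events alone.

```python
from collections import defaultdict


def trim_overlaps(events):
    """Shorten complete ('X') events in place so that no two events on the
    same (pid, tid) track partially overlap. Nested events are left as-is."""
    tracks = defaultdict(list)
    for ev in events:
        if ev.get("ph") == "X" and "dur" in ev:
            tracks[(ev.get("pid"), ev.get("tid"))].append(ev)

    for track in tracks.values():
        # Earlier start first; on ties, the longer (enclosing) event first.
        track.sort(key=lambda e: (e["ts"], -e["dur"]))
        open_events = []  # stack of events that may still enclose later ones
        for ev in track:
            start, end = ev["ts"], ev["ts"] + ev["dur"]
            while open_events:
                top = open_events[-1]
                top_end = top["ts"] + top["dur"]
                if top_end > start and end <= top_end:
                    break  # ev is fully nested inside top: nothing to fix
                if top_end > start:
                    # Partial overlap: end the earlier event where ev begins.
                    top["dur"] = start - top["ts"]
                open_events.pop()
            open_events.append(ev)
    return events


# The case from the post, in integer microseconds: event1 ends at 10100,
# event2 begins at 10050 on the same track.
demo = [
    {"name": "event1", "ph": "X", "pid": 0, "tid": 7, "ts": 10000, "dur": 100},
    {"name": "event2", "ph": "X", "pid": 0, "tid": 7, "ts": 10050, "dur": 100},
]
trim_overlaps(demo)
```

After the call, `event1` ends exactly where `event2` begins. To apply it to a real file, the event list would typically be loaded with `json.load` first, either as the top-level array or from the `traceEvents` key, depending on how the trace was written.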
Yi Pan (@conlesspan)
Huge congrats to the team!!!
Baris Kasikci (@bariskasikci)

Super stoked that UW SyFI (syfi.cs.washington.edu) members won a number of prizes at the MLSys'26 competition, NVIDIA Track. Huge congrats to @KeisukeKamahori, @sudopowr, Yile Gu, Wei Shen, Steven Gao! Thanks to @nvidia, @modal, and the Flashinfer team for the support.
1st place in the GDN Track, Full-Agent Approach
2nd place in the GDN Track, Agent-Assisted Approach
3rd place in the DSA Track, Full-Agent Approach

English
0
0
2
107
Yi Pan retweeted
Vic Shihang Li (@sudopowr)
Today's AI agents can diagnose production incidents, but they start from scratch every single time. What if they could remember? New on @acmsigops: our work on the Self-Defining Operator, a multi-agent system with long-term memory for autonomous ops.
ACM SIGOPS (@ACMSIGOPS)

New SIGOPS Blog -- "The Long Game: How Agents That Remember Resolve Operational Issues Faster" by Shihang (Vic) Li, Thomas Anderson, Ratul Mahajan, Simon Peter, Luke Zettlemoyer, and the SDS team. sigops.org/2026/the-long-…

English
3
9
18
3.1K
Ying Sheng (@ying11231)
We've been running @radixark for a few months, started by many core developers in SGLang @lmsysorg and its extended ecosystem (slime @slime_framework , AReaL @jxwuyi). I left @xai in August — a place where I built deep emotions and countless beautiful memories. It was the best place I’ve ever worked, the place I watched grow from a few dozen people to hundreds, and it truly felt like home. What pushed me to make such a hard decision is the momentum of building SGLang open source and the mission of creating an ambitious future, within an open spirit that I learnt from my first job at @databricks after my PhD. We started SGLang in the summer of 2023 and made it public in January 2024. Over the past 2 years, hundreds of people have made great efforts to get to where they are today. We experienced several waves of growth after its first release. I still remember the many dark nights in the summer of 2024, I spent with @lm_zheng , @lsyincs , and @zhyncs42 debugging, while @ispobaoke single-handedly took on DeepSeek inference optimizations, seeing @GenAI_is_real and the community strike team tag-teaming on-call shifts non-stop. There are so many more who have joined that I'm out of space to call out, but they're recorded on the GitHub contributor list forever. The demands grow exponentially, and we have been pushed to make it a dedicated effort supported by RadixArk. It’s the step-by-step journey of a thousand miles that has carried us here today, and the same relentless Long March that will lead us into the tens of thousands of miles yet to come. The story never stops growing. Over the past year, we’ve seen something very clear: The world is full of people eager to build AI, but the infrastructure that makes it possible is not shared. The most advanced inference and training stacks live inside a few companies. 
Everyone else is forced to rebuild the same schedulers, compilers, serving engines, and training pipelines again and again — often under enormous pressure, with lots of duplicated effort and wasted insight. RadixArk was born to change that. Today, we’re building an infrastructure-first, deep-tech company with a simple and ambitious mission: "Make frontier-level AI infrastructure open and accessible to everyone." If the two values below resonate with you, come talk to us: (1) Engineering as an art. Infrastructure is a first-class citizen in RadixArk. We care about elegant design and code that lasts. Beneath every line of code lies the soul of the engineer who wrote it. (2) A belief in openness. We share what we build. We bet on long-term compounding through community, contribution, and giving more than we take. A product is defined by its users, yet it truly comes alive the moment functionality transcends mere utility and begins to embody aesthetics. Thanks to all the miles (the name of our first released RL framework; see below). radixark.ai
English
116
131
1.2K
551.5K
Yi Pan retweeted
Baris Kasikci (@bariskasikci)
How to beat all compression using LLMs? ⚙️ Introducing LLMc — a lossless compressor built with LLMs. LLMc leverages the predictive power of LLMs to beat traditional compressors like Gzip and LZMA on natural language text. (1/4) 🔗 Blog Post: syfi.cs.washington.edu/blog/2025-10-0… 💻 Code: github.com/uw-syfi/LLMc
English
2
5
22
3.3K
ChatGPT辽太郎 (@jian_w3ng)
Computer systems: the Android track; artificial intelligence: the Apple track
Chinese
4
1
29
3K
Yi Pan (@conlesspan)
@iskyzh Just had my dinner there😋
English
0
0
1
153
迟猫猫🐱 (@iskyzh)
a sip of Bellevue 🤪 I love this place (only in summer)
English
6
0
57
6.3K
Baris Kasikci (@bariskasikci)
🚀 Presenting LiteASR: a method that halves the compute cost of speech encoders by leveraging low-rank approximation of activations. LiteASR has been accepted to #EMNLP2025 (main) @emnlpmeeting
English
8
6
41
7.5K
Yi Pan retweeted
Tianyin Xu (@tianyin_xu)
A petition to SIGOPS to adopt the USENIX Annual Technical Conference (ATC) and retain its steering committee docs.google.com/document/d/1wK… (not sure whether it can be done by SIGOPS alone, but it's great to let the voice be heard)
English
2
15
60
6.2K
Baris Kasikci (@bariskasikci)
Grateful to the DSN community for the rising star recognition! Huge thanks to the letter writers, organizers, selection committee, all my collaborators, my advisor, and most importantly my group members, who make it all possible!
Saurabh Bagchi (@bagchi_saurabh)

IEEE/IFIP DSN conference @DsnIeee just wrapped up in Naples. The Rising Star award, given to someone less than 10 years from graduation, went to Baris Kasikci @bariskasikci of University of Washington for his contributions to theory and industrial impact of dependability. I chaired the committee and thanks to the members for a diligent process to arrive at the winner. Miguel P. Correia (University of Lisbon) @miguelnpcorreia, Bianca Schroeder (University of Toronto), Amith Singhee (IBM Research, India) @asinghee1, Angelos Stavrou (Virginia Tech) @AngelosStavrou.

English
2
2
30
3.6K
Banghua Zhu (@BanghuaZ)
Excited to share that I’m joining NVIDIA as a Principal Research Scientist! We’ll be joining forces on efforts in model post-training, evaluation, agents, and building better AI infrastructure—with a strong emphasis on collaboration with developers and academia. We’re committed to open-sourcing our work and sharing it with the world. Let’s build a stronger, more open AI community together!
English
141
95
2.5K
249.8K