Sidharth Babu

24 posts

@Sbabu2020

ML Systems Ruminator, MS-ECE @CMU | Prev @Adobe, @Keysight, @NASA, @UTAustin

Pittsburgh, PA · Joined August 2016
360 Following · 20 Followers
Steven Kolawole @_stevenkolawole
I'll be interning with AWS Bedrock as an applied scientist in SF!! 🥵 I'll be working with the inference optimization science team on spec decoding + MoE inference. Deeply humbled and honored to announce that I'm deeply humbled and honored.
84
105
921
26.6K
Sidharth Babu @Sbabu2020
Model guys making models with a hidden dimension that’s not a power of two makes me feel better about systems guys keeping their jobs for a little longer 🫠
0
0
1
28
Sidharth Babu retweeted
Zhihao Jia @JiaZhihao
🚀Introducing Motus, the open-source agent infrastructure that learns in production. Existing agent infra serves static agents: the harness, model, and workflow are fixed after deployment. But static agents degrade over time. The harness goes stale, new models go unincorporated, context drifts, and latency compounds. Motus closes this gap by learning from every trace (failures, latency, cost, and task outcomes) and using those signals to continuously optimize agent harness, model orchestration, context memory, and end-to-end latency. Early results: higher accuracy than any single frontier model at 2.3× lower cost (Terminal-Bench 2.0, SWE-bench Verified), with 52% lower latency and 45% better memory recall. Open source under Apache 2.0. Works with any agent SDK. Deploy with one command. github.com/lithos-ai/motus lithosai.com
22
71
565
56K
Sidharth Babu @Sbabu2020
A few hours of memory aliasing bugs later, Qwen3.5 is available on MLC-LLM! Take a look at llm.mlc.ai - always cool to see these things run on your own hardware. Big thanks to our community contributors!
1
1
1
54
Sidharth Babu retweeted
Zhihao Jia @JiaZhihao
Excited to see our inaugural CMU Catalyst Research Summit bring together 120+ attendees! A full day of discussions on the future of agentic AI systems, multi-modal AI, and ML compilation—with amazing energy from both academia and industry. Co-organized with @tqchenml @BeidiChen @Tim_Dettmers — this is just the beginning 🚀
2
12
87
23.8K
Sidharth Babu retweeted
Zhihao Jia @JiaZhihao
The MLSys’26 program is live! Check out the accepted papers: mlsys.org/virtual/2026/p… This year marks several exciting firsts: • 28 industry track papers bridging MLSys research & real-world deployment • Our inaugural competition track featuring AWS Trainium, Google Graph Scheduling, and NVIDIA FlashInfer AI Kernel contests Early registration deadline: April 1 — don’t miss it! See you in Seattle this May🌲
1
25
142
17.8K
Dillon Mulroy @dillon_mulroy
i think i’m back to wanting a really good tab model - any progress here outside of cursor (i don’t have access to supermaven) and for nvim?
50
2
300
67.8K
Sidharth Babu retweeted
Tanishq Kumar @tanishqkumar07
I've been working on a new LLM inference algorithm. It's called Speculative Speculative Decoding (SSD) and it's up to 2x faster than the strongest inference engines in the world. Collab w/ @tri_dao @avnermay. Details in thread.
135
456
4.1K
608.9K
Sidharth Babu retweeted
Tri Dao @tri_dao
The FA4 paper is finally out after a year of work. On Blackwell GPUs, attention now goes about as fast as matmul even though the bottlenecks are so different! Tensor cores are now so crazy fast that attn fwd is bottlenecked by exponential, and attn bwd is bottlenecked by shared memory bandwidth. Some fun stuff in the redesigned algorithm to overcome these bottlenecks: exponential emulation with polynomials, new online softmax to avoid 90% of softmax rescaling, 2CTA MMA instructions that allow two thread blocks to share operands to reduce smem traffic.
Ted Zadouri @tedzadouri

Asymmetric hardware scaling is here. Blackwell tensor cores are now so fast, exp2 and shared memory are the wall. FlashAttention-4 changes the algorithm & pipeline so that softmax & SMEM bandwidth no longer dictate speed. Attn reaches ~1600 TFLOPs, pretty much at matmul speed! joint work w/ Markus Hoehnerbach, Jay Shah(@ultraproduct), Timmy Liu, Vijay Thakkar (@__tensorcore__ ), Tri Dao (@tri_dao) 1/

31
229
1.8K
188.1K
Sidharth Babu @Sbabu2020
@yminsky The answer is @Railway! The deployment process of your own code basically boils down to just a git push, they have a CLI and MCP server for agents to use, and @JustJake and team are super responsive to user concerns. Also has baked in templated services, one click Postgres etc.
0
0
0
18
Yaron (Ron) Minsky @yminsky
So, where is the application hosting platform of the future that's optimized for vibe coding? There are a ton of little applications I'd love to build for myself if the costs and annoyances of setting up services and permissions were mitigated.
38
1
87
42.7K
Sidharth Babu retweeted
Saksham @sgdescent
Started an ML sys reading group with friends @SCSatCMU. Systolic arrays are so cool!
5
5
99
5.1K
Sidharth Babu @Sbabu2020
@wcsDirofSchools When sending a tip to this line, it sends it to "Winnebago County, IL Sheriffs office"? You may want to inquire as to why it doesn't seem to work.
0
0
0
0
Sidharth Babu @Sbabu2020
@linc444 @Wegner_Jeff1 @DavidTresch @wcsDirofSchools I would like to add that, according to Christopher Ferguson of Stetson University, the methodology of the majority of violence-video game studies is actually inherently flawed and does not produce scientific data.
0
0
0
0
Sidharth Babu @Sbabu2020
@wcsDirofSchools I implore you and anyone else with political power: enforce and reform our policies if you want to see real improvement. Otherwise, nothing will change and tragedies will inevitably still occur.
1
0
1
0