Sidharth Babu

24 posts

@Sbabu2020

ML Systems Ruminator, MS-ECE @CMU | Prev @Adobe, @Keysight, @NASA, @UTAustin

Pittsburgh, PA · Joined August 2016

360 Following · 20 Followers

Sidharth Babu@Sbabu2020·21h

@_stevenkolawole Congratulations Steven!!

Steven Kolawole@_stevenkolawole·1d

I'll be interning with AWS Bedrock as an applied scientist in SF!! 🥵 I'll be working with the inference optimization science team on spec decoding + MoE inference. Deeply humbled and honored to announce that I'm deeply humbled and honored.

Sidharth Babu@Sbabu2020·17 Apr

Model guys making models with a hidden dimension that’s not a power of two makes me feel better about systems guys keeping their jobs for a little longer 🫠

Sidharth Babu reposted

Zhihao Jia@JiaZhihao·14 Apr

🚀Introducing Motus, the open-source agent infrastructure that learns in production. Existing agent infra serves static agents: the harness, model, and workflow are fixed after deployment. But static agents degrade over time. The harness goes stale, new models go unincorporated, context drifts, and latency compounds. Motus closes this gap by learning from every trace (failures, latency, cost, and task outcomes) and using those signals to continuously optimize agent harness, model orchestration, context memory, and end-to-end latency. Early results: higher accuracy than any single frontier model at 2.3× lower cost (Terminal-Bench 2.0, SWE-bench Verified), with 52% lower latency and 45% better memory recall. Open source under Apache 2.0. Works with any agent SDK. Deploy with one command. github.com/lithos-ai/motus lithosai.com

Sidharth Babu@Sbabu2020·2 Apr

Source PR: github.com/mlc-ai/mlc-llm…

Sidharth Babu@Sbabu2020·2 Apr

A few hours of memory aliasing bugs later, Qwen3.5 is available on MLC-LLM! Take a look at llm.mlc.ai - always cool to see these things run on your own hardware. Big thanks to our community contributors!

Sidharth Babu reposted

Zhihao Jia@JiaZhihao·29 Mar

Excited to see our inaugural CMU Catalyst Research Summit bring together 120+ attendees! A full day of discussions on the future of agentic AI systems, multi-modal AI, and ML compilation—with amazing energy from both academia and industry. Co-organized with @tqchenml @BeidiChen @Tim_Dettmers — this is just the beginning 🚀

Sidharth Babu reposted

Zhihao Jia@JiaZhihao·26 Mar

The MLSys’26 program is live! Check out the accepted papers: mlsys.org/virtual/2026/p… This year marks several exciting firsts: • 28 industry track papers bridging MLSys research & real-world deployment • Our inaugural competition track featuring AWS Trainium, Google Graph Scheduling, and NVIDIA FlashInfer AI Kernel contests Early registration deadline: April 1 — don’t miss it! See you in Seattle this May🌲

Sidharth Babu@Sbabu2020·19 Mar

@dillon_mulroy @swyx There’s a plugin here (I use Supermaven so I haven’t tested): github.com/Exafunction/wi…

Dillon Mulroy@dillon_mulroy·18 Mar

@swyx can i use this in neovim 👀

Dillon Mulroy@dillon_mulroy·18 Mar

i think i’m back to wanting a really good tab model - any progress here outside of cursor (i don’t have access to supermaven) and for nvim?

Sidharth Babu reposted

dax@thdxr·8 Mar

and yet my list of software i need but no one's built continues to grow

Rohan Pandey@khoomeik

a few friends are trying polyphasic sleep so they can supervise their coding agents 24/7

Sidharth Babu reposted

Tanishq Kumar@tanishqkumar07·4 Mar

I've been working on a new LLM inference algorithm. It's called Speculative Speculative Decoding (SSD) and it's up to 2x faster than the strongest inference engines in the world. Collab w/ @tri_dao @avnermay. Details in thread.

Sidharth Babu reposted

Tri Dao@tri_dao·5 Mar

The FA4 paper is finally out after a year of work. On Blackwell GPUs, attention now goes about as fast as matmul even though the bottlenecks are so different! Tensor cores are now so crazy fast that attn fwd is bottlenecked by exponential, and attn bwd is bottlenecked by shared memory bandwidth. Some fun stuff in the redesigned algorithm to overcome these bottlenecks: exponential emulation with polynomials, new online softmax to avoid 90% of softmax rescaling, 2CTA MMA instructions that allow two thread blocks to share operands to reduce smem traffic.

Ted Zadouri@tedzadouri

Asymmetric hardware scaling is here. Blackwell tensor cores are now so fast, exp2 and shared memory are the wall. FlashAttention-4 changes the algorithm & pipeline so that softmax & SMEM bandwidth no longer dictate speed. Attn reaches ~1600 TFLOPs, pretty much at matmul speed! joint work w/ Markus Hoehnerbach, Jay Shah(@ultraproduct), Timmy Liu, Vijay Thakkar (@__tensorcore__ ), Tri Dao (@tri_dao) 1/

Sidharth Babu@Sbabu2020·3 Mar

@yminsky The answer is @Railway! Deploying your own code basically boils down to a git push, they have a CLI and an MCP server for agents to use, and @JustJake and team are super responsive to user concerns. It also has baked-in templated services, one-click Postgres, etc.

Yaron (Ron) Minsky@yminsky·3 Mar

So, where is the application hosting platform of the future that's optimized for vibe coding? There are a ton of little applications I'd love to build for myself if the costs and annoyances of setting up services and permissions were mitigated.

Sidharth Babu reposted

Saksham@sgdescent·27 Feb

Started an ML sys reading group with friends @SCSatCMU. Systolic arrays are so cool!

Sidharth Babu@Sbabu2020·19 Feb

@wcsDirofSchools Sending a tip to this line routes it to the "Winnebago County, IL Sheriff's Office"? You may want to look into why it doesn't seem to work.

Sidharth Babu@Sbabu2020·19 Feb

@linc444 @Wegner_Jeff1 @DavidTresch @wcsDirofSchools I would like to add that, according to Christopher Ferguson of Stetson University, the methodology of most studies on video games and violence is inherently flawed and does not produce scientific data.

Sidharth Babu@Sbabu2020·19 Feb

@wcsDirofSchools Source article: rollingstone.com/glixel/feature… Authors: Patrick M. Markey, a professor of psychology at Villanova University, and Christopher J. Ferguson, a professor at Stetson University.

Sidharth Babu@Sbabu2020·19 Feb

@wcsDirofSchools I implore you and anyone else with political power to enforce and reform our policies if you want to see real improvement. Otherwise, nothing will change and tragedies will inevitably continue to occur.

Sidharth Babu@Sbabu2020·19 Feb

@DavidTresch @RaineyMatthew @wcsDirofSchools While that may be true, the PDF below from Christopher Ferguson (a professor at Stetson University) explains how the majority of video game research studies have been methodologically flawed. christopherjferguson.com/Angry%20Birds.…
