Senthilkumar Gopal

1.2K posts

@sengopal

❤️ to code and solve new problems every day @NVIDIAAI for planet-scale distributed LLM inference | @GeorgiaTech | Opinions only my own.

California, USA · Joined May 2009
134 Following · 234 Followers
Senthilkumar Gopal retweeted
Chris Fregly @cfregly
Join me today at 9am PT for an awesome set of talks on performance highlights from @nvidia GTC 2026 and AI inference, including disaggregated prefill-decode with NVIDIA GPUs and RadixAttention. meetup.com/ai-performance…
Senthilkumar Gopal retweeted
NVIDIA Newsroom @nvidianewsroom
.@togethercompute is bringing its inference research with NVIDIA Dynamo 1.0 to deliver an accelerated, cost-effective inference stack for production AI workloads.
Senthilkumar Gopal retweeted
LMSYS Org @lmsysorg
Excited to share our latest collaboration blog with @NVIDIA on how SGLang unlocks massive inference performance gains on GB300 NVL72 (Blackwell Ultra) vs H200 in InferenceXv2!
Results:
1️⃣ 25× throughput on GB300 NVL72 vs H200 @ 50 TPS/user
2️⃣ 8× performance gain on GB200 NVL72 in under 4 months
3️⃣ 4× TPS/User improvement in high interactivity regime on GB200 NVL72
Key techniques include:
🧠 NVFP4 GEMM optimizations tailored for MoE reasoning models
🔄 Computation–communication overlap tuned specifically for NVL72
🚀 Deep integration with NVIDIA Dynamo for disaggregated inference
Huge thanks to the @NVIDIAAIDev and SGLang teams for making this happen 🙌
Senthilkumar Gopal retweeted
NVIDIA AI Developer @NVIDIAAIDev
🧵 NVIDIA Dynamo v0.9.0 is live and it's probably our biggest infrastructure upgrade yet. Highlights this time include:
✅ Sneak preview of FlashIndexer
✅ Expanded multi-modal support
✅ Removed NATS & ETCD
And a bonus... @meituan (Chinese Doordash + LLM builders) recently dropped an OSS inference engine built on @sgl_project + Dynamo. 👇
Senthilkumar Gopal retweeted
NVIDIA AI Developer @NVIDIAAIDev
🙌 Thank you, @baseten, for being an engaged and impactful contributor in the NVIDIA Dynamo ecosystem. By running Dynamo at scale and sharing learnings from real customer workloads, you’ve helped strengthen the project for the broader community. We’ve already seen the benefits (faster TTFT, lower per-token latency, and higher throughput on long-context workloads), along with valuable contributions across Dynamo, TensorRT LLM, and your open-sourced Suffix Automaton–based MTP accelerator. Learn more 👇
Baseten @baseten

Thanks @NVIDIAAI for inviting us to Dynamo Day! We're active users of Dynamo, iterating on it in production for performance gains like 50% lower TTFT and 34% lower TPOT, and regularly shipping our work back to the community. Read some of our highlights from Dynamo Day and working with NVIDIA Dynamo here: baseten.co/blog/nvidia-dy…

Senthilkumar Gopal retweeted
Vijay Janapa Reddi @profvjreddi
Today I’m sharing Tiny🔥Torch—an educational framework for ML systems, built from scratch. You don’t just train models, you build tensors, autograd, optimizers, and data loaders, and see how design choices affect memory, performance, and efficiency. If you use @PyTorch or @TensorFlow, this helps learners see what’s really happening under the hood. Too many students learn how to use ML frameworks, but never how to build one. Tiny🔥Torch is about closing that gap. Early, open, and still evolving, looking for fellow educators and learners. Ideas and help welcome 🙏 mlsysbook.ai/tinytorch/intr…
Senthilkumar Gopal @sengopal
I just realized that the volume of work done by creatives, as described by @AdamMGrant in #Originals, is due to "Agency" 😃
Andrej Karpathy @karpathy

Agency > Intelligence I had this intuitively wrong for decades, I think due to a pervasive cultural veneration of intelligence, various entertainment/media, obsession with IQ etc. Agency is significantly more powerful and significantly more scarce. Are you hiring for agency? Are we educating for agency? Are you acting as if you had 10X agency? Grok explanation is ~close: “Agency, as a personality trait, refers to an individual's capacity to take initiative, make decisions, and exert control over their actions and environment. It’s about being proactive rather than reactive—someone with high agency doesn’t just let life happen to them; they shape it. Think of it as a blend of self-efficacy, determination, and a sense of ownership over one’s path. People with strong agency tend to set goals and pursue them with confidence, even in the face of obstacles. They’re the type to say, “I’ll figure it out,” and then actually do it. On the flip side, someone low in agency might feel more like a passenger in their own life, waiting for external forces—like luck, other people, or circumstances—to dictate what happens next. It’s not quite the same as assertiveness or ambition, though it can overlap. Agency is quieter, more internal—it’s the belief that you *can* act, paired with the will to follow through. Psychologists often tie it to concepts like locus of control: high-agency folks lean toward an internal locus, feeling they steer their fate, while low-agency folks might lean external, seeing life as something that happens *to* them.”

Senthilkumar Gopal retweeted
vLLM @vllm_project
🤩 Check out this blog from @awscloud about scaling Rufus, which is powered by @vllm_project on Inferentia and Trainium! Serving 3 million tokens a minute!
"Within each container, an NVIDIA Triton Inference Server with a Python backend is used running vLLM with the Neuron SDK. vLLM is a memory-efficient inference and serving engine that is optimized for high throughput."
"These choices allowed Rufus to scale up over 80,000 Trainium and Inferentia chips across three Regions serving an average of 3 million tokens a minute while maintaining P99 less than 1 second latency to the first response for Prime Day customers."
aws.amazon.com/blogs/machine-…
Senthilkumar Gopal @sengopal
Ah... but the distribution of potential answers is known 😀 It doesn't matter whether it is random numbers or picking a response to an open-ended question. With copious amounts of data, the central limit theorem always wins.
Andrej Karpathy @karpathy

Consider being a labeler for an LLM. The prompt is “give me a random number between 1 and 10”. What SFT & RM labels do you contribute? What does this do to the network when trained on? In a subtle way this problem is present in every prompt that does not have a single unique answer.

Barstool Georgia Tech @BarstoolGT
Who’s the best GT Professor you’ve ever had? ⬇️
Senthilkumar Gopal retweeted
Jeremy Howard @jeremyphoward
Big news: we're launching a new course in <4 weeks. "From Deep Learning Foundations to Stable Diffusion". Bigger news: for this course, we're teaming up with @StabilityAI! AFAIK, this is the 1st course that covers every method used in Stable Diffusion. fast.ai/posts/part2-20…
Senthilkumar Gopal retweeted
eBay Tech @ebaytech
Our Coded Coupons tool gives sellers flexibility and control over how they offer discounts to customers. ebayinc.to/3ADrd1G
Senthilkumar Gopal @sengopal
@karpathy @blakecrouch1 - Recursion, Dark Matter, and recently Upgrade. Excellent novels with a great scientific foundation, intertwined with human psychology and paced wonderfully.