
Senthilkumar Gopal
@sengopal
❤️ to code and solve new problems every day @NVIDIAAI for planet scale distributed LLM inference | @GeorgiaTech | Opinions only my own.

Thanks @NVIDIAAI for inviting us to Dynamo Day! We're active users of Dynamo, iterating on it in production for performance gains like 50% lower TTFT and 34% lower TPOT, and regularly shipping our work back to the community. Read some of our highlights from Dynamo Day and working with NVIDIA Dynamo here: baseten.co/blog/nvidia-dy…


Agency > Intelligence I had this intuitively wrong for decades, I think due to a pervasive cultural veneration of intelligence, various entertainment/media, obsession with IQ etc. Agency is significantly more powerful and significantly more scarce. Are you hiring for agency? Are we educating for agency? Are you acting as if you had 10X agency? Grok explanation is ~close: “Agency, as a personality trait, refers to an individual's capacity to take initiative, make decisions, and exert control over their actions and environment. It’s about being proactive rather than reactive—someone with high agency doesn’t just let life happen to them; they shape it. Think of it as a blend of self-efficacy, determination, and a sense of ownership over one’s path. People with strong agency tend to set goals and pursue them with confidence, even in the face of obstacles. They’re the type to say, “I’ll figure it out,” and then actually do it. On the flip side, someone low in agency might feel more like a passenger in their own life, waiting for external forces—like luck, other people, or circumstances—to dictate what happens next. It’s not quite the same as assertiveness or ambition, though it can overlap. Agency is quieter, more internal—it’s the belief that you *can* act, paired with the will to follow through. Psychologists often tie it to concepts like locus of control: high-agency folks lean toward an internal locus, feeling they steer their fate, while low-agency folks might lean external, seeing life as something that happens *to* them.”


Consider being a labeler for an LLM. The prompt is “give me a random number between 1 and 10”. What SFT & RM labels do you contribute? What does this do to the network when trained on? In a subtle way this problem is present in every prompt that does not have a single unique answer.

vLLM v0.3.3 is released with Starcoder2 @BigCodeProject and Inferentia @awscloud support. I'm also excited about the addition of guided decoding* (JSON, regex) in the server, leveraging @OutlinesOSS. *experimental, the schema takes some time to compile but will be cached.

I think TPU and Trainium optimization is even more fun than GPU optimization. The architectures are simpler and more like a puzzle, and our performance analysis tools are better than GPU ones. See Trainium's trace of every instruction: awsdocs-neuron.readthedocs-hosted.com/en/latest/tool…



So proud of my friend Charles Isbell @isbellHFh, who will be U. Wisconsin's next provost, starting this Fall. For those not steeped in academic terminology, that's basically the school's CEO (one step below president). news.wisc.edu/charles-lee-is…
