Pankaj Gupta

601 posts

@defpan

Co-founder @basetenco working on ML model performance

Bay Area · Joined September 2011
899 Following · 326 Followers
Pankaj Gupta @defpan
@rapprach Absolutely thrilled about this milestone. This is a true turning point for cold starts. What’s coming next is going to be truly mind blowing. Watch this space!
Pankaj Gupta reposted
AT @AliesTaha
- 230 training runs
- 1,623 GPU hours (67 B200 days)
- 76 TB of training data
- a 2x faster model

Every paper said it can't be done. Quantization Aware Distillation made it possible.
AT @AliesTaha

x.com/i/article/2029…

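The thread gives the headline numbers but not the recipe. A minimal sketch of the two ingredients of quantization-aware distillation, assuming a symmetric low-bit weight grid and the standard temperature-softened KL objective; the function names are illustrative, not taken from the linked article:

```python
import numpy as np

def fake_quantize(w: np.ndarray, bits: int = 4) -> np.ndarray:
    """Round weights to a symmetric low-bit grid but keep them as floats,
    so the student 'sees' quantization error during training."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

def softmax(x: np.ndarray, t: float = 1.0) -> np.ndarray:
    z = np.exp((x - x.max()) / t)
    return z / z.sum()

def distill_loss(student_logits, teacher_logits, t: float = 2.0) -> float:
    """KL(teacher || student) on temperature-softened distributions:
    the classic distillation objective."""
    p = softmax(np.asarray(teacher_logits, dtype=float), t)
    q = softmax(np.asarray(student_logits, dtype=float), t)
    return float(np.sum(p * (np.log(p) - np.log(q))) * t * t)
```

Training a quantized student against a full-precision teacher's soft labels, rather than against hard targets alone, is what lets the low-bit model recover accuracy the rounding would otherwise cost.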
Pankaj Gupta @defpan
Had such a blast!
Baseten @baseten

Earlier this month, we hosted our biannual company-wide offsite and gathered 180 teammates in Austin, TX. Highlights included:
- talent show
- a chat with @saranormous about the evolution of the inference market
- fireside chat with @EvidenceOpen
- hackathon
- a Texas ranch experience

Within the last year, Baseten has moved faster than ever before. With 4X team growth, 12X revenue growth, and 3 separate fundraises, it's hard to believe how far we've come. At that pace, alignment doesn't just happen. Our offsites enable us to celebrate wins, strengthen relationships across teams, and align on the next few months.

And we're just getting started. If this sounds exciting to you, join us! baseten.co/careers

Pankaj Gupta reposted
Baseten @baseten
We painted San Francisco green and pink, and the message is clear — you need to own your inference. If you spot us around the city, share a picture with us. We’ll send you something!
Pankaj Gupta reposted
World Labs @theworldlabs
We’re building foundational world models to power the next era of 3D. From robotics to gaming, spatial intelligence unlocks entirely new worlds. Powered by inference at scale – shoutout to Baseten.
Pankaj Gupta reposted
Jeff Huber @jeffreyhuber
the bar has been raised for book printing. thanks @philipkiely for the copy!
Pankaj Gupta @defpan
Inference is hard to learn because there are so many moving pieces. Now you can see the whole stack in one place.
Pankaj Gupta reposted
Baseten @baseten
Generational AI companies are powered by Baseten. Why? We obsess over the milliseconds, so they can ship the future. Focus on what actually differentiates you. Leave the inference to us.
Pankaj Gupta reposted
AT @AliesTaha
We quantized the best open-source diffusion model on the market to 4 bits: huge speedup, (almost) no quality loss. This is a full explanation of the trillion-dollar industry's oldest trick.
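For context on what 4-bit quantization actually does, here is a minimal per-channel int4 round-trip in NumPy. Everything in it (function names, shapes, the toy weight matrix) is illustrative rather than taken from the post:

```python
import numpy as np

def quantize_per_channel(W: np.ndarray, bits: int = 4):
    """Map each output channel (row) to signed int4 values in [-8, 7],
    storing one float scale per row."""
    qmax = 2 ** (bits - 1) - 1                       # 7 for signed 4-bit
    scales = np.abs(W).max(axis=1, keepdims=True) / qmax
    Q = np.clip(np.round(W / scales), -qmax - 1, qmax).astype(np.int8)
    return Q, scales

def dequantize(Q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Recover approximate float weights from int4 values and per-row scales."""
    return Q.astype(np.float32) * scales

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16)).astype(np.float32)      # toy weight matrix
Q, scales = quantize_per_channel(W)
W_hat = dequantize(Q, scales)
rel_err = float(np.abs(W - W_hat).mean() / np.abs(W).mean())
```

Signed 4-bit storage is 8x smaller than fp32, which is where most of the speedup comes from on memory-bandwidth-bound inference; the small relative round-trip error is the "(almost) no quality loss" trade-off at the matrix level.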
Pankaj Gupta reposted
Baseten @baseten
Introducing Kimi K2.5 on Baseten's Model APIs with the most performant TTFT (0.26 sec) and TPS (340) on Artificial Analysis. Even among a landscape of incredible open source models, Kimi K2.5 stands out with its multi-modal capabilities and its ability to accommodate an alarmingly large number of tool calls. Get the good stuff here: baseten.co/library/kimi-k…
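TTFT and TPS are straightforward to measure against any streaming endpoint. A sketch, where the hypothetical fake_stream generator stands in for a real streaming API and its delays are invented:

```python
import time

def measure_streaming(token_stream):
    """Time-to-first-token (TTFT) and decode tokens/sec (TPS) for any
    iterator that yields tokens as they are generated."""
    start = time.perf_counter()
    first = None
    count = 0
    for _tok in token_stream:
        if first is None:
            first = time.perf_counter()
        count += 1
    end = time.perf_counter()
    ttft = first - start
    # TPS is measured over the decode phase only (tokens after the first),
    # so prefill latency doesn't dilute the throughput number.
    tps = (count - 1) / (end - first) if count > 1 and end > first else 0.0
    return ttft, tps

def fake_stream(n: int = 10, prefill: float = 0.05, per_token: float = 0.01):
    """Hypothetical stand-in for a streaming model API."""
    time.sleep(prefill)                   # prefill latency before the first token
    for i in range(n):
        if i:
            time.sleep(per_token)         # steady decode cadence
        yield f"tok{i}"
```

Separating TTFT from TPS matters because the two are dominated by different phases (prefill vs. decode), which is why leaderboards like Artificial Analysis report both.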
Pankaj Gupta @defpan
RT @tuhinone: The biggest hurdle to widespread AI adoption isn't just model capability, it's the cost and speed of inference. At Baseten, o…
Pankaj Gupta reposted
Baseten @baseten
We boosted acceptance rate by up to 40% with the Baseten Speculation Engine. How? By combining Multi-Token Prediction (MTP) with Suffix Automaton (SA) decoding. This hybrid approach crushes production coding workloads, delivering 30%+ longer acceptance lengths on code editing tasks with zero added overhead. An open source version for TensorRT-LLM is now available to the community. Read the full engineering deep dive: baseten.co/blog/boosting-…
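The linked deep dive has the real details. As a toy illustration of the suffix-matching half of the idea, this sketch drafts the continuation of the longest context suffix that appeared earlier (a brute-force stand-in for a suffix-automaton lookup) and accepts only the prefix the target model agrees with; all names here are hypothetical:

```python
def suffix_match_draft(context: list, k: int = 4) -> list:
    """Draft up to k tokens by finding the longest suffix of the context
    that also occurs earlier in it, and copying what followed that
    earlier occurrence."""
    n = len(context)
    for length in range(min(32, n - 1), 0, -1):
        suffix = context[n - length:]
        # Brute-force search for an earlier occurrence of this suffix.
        for start in range(n - length - 1, -1, -1):
            if context[start:start + length] == suffix:
                continuation = context[start + length:start + length + k]
                if continuation:
                    return continuation
    return []

def accept_length(draft: list, target_next: list) -> int:
    """Length of the longest draft prefix the target model agrees with;
    only accepted tokens are kept, so output matches the target exactly."""
    accepted = 0
    for d, t in zip(draft, target_next):
        if d != t:
            break
        accepted += 1
    return accepted
```

Verification keeps the output identical to what the target model would have produced on its own; the speedup comes from checking several drafted tokens in one target forward pass, and repetitive inputs like code edits let the draft match many tokens at once.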
Pankaj Gupta reposted
Baseten @baseten
"the best application layer companies set up the harness and how to use it for the problem that your user is trying to solve"
Pankaj Gupta reposted
Tuhin Srivastava @tuhinone
Baseten's day 0 bet was that inference was the technology that would enable the best user experiences AI could deliver: fast, smart, reliable, secure. And that those experiences would rely not only on a handful of giant general intelligence models, but millions of specialized models built by companies for their specific customers and use cases. Whether you're a doctor, developer, lawyer, mechanic, researcher, construction worker, marketer, etc., you're accelerated by specialized tools worthy of your craft. To me, this is one of the most meaningful promises AI can deliver on.

We're starting to see it now. Many of the main-character AI companies on the application layer are built on highly-specialized models for highly-specialized workflows (Abridge, Clay, Cursor, OpenEvidence, Hebbia, Mercor, Notion); these businesses are booming because customers love specialized tools. There are probably hundreds of custom models in production today. Soon, there will be thousands and then millions. All enabled by a high-performing inference layer.

Inference has emerged as one of the hardest problems in modern AI systems. Delivering reliable, low-latency experiences requires deep coordination across distributed infrastructure, kernel-level performance, and software ergonomics; even world-class teams struggle to do this well. As a result, as consumers and developers, we've grown to accept sluggish performance, frequent downtime, and inconsistent quality across both application companies and model providers.

Meanwhile, the demands on inference are accelerating: AI adoption is trending towards ubiquity with reasoning models that are orders of magnitude more compute-intensive. This will only increase as more companies catch on to the virtues of owning their end-to-end IP rather than relying on black-box model APIs on shared infrastructure. Whether we can realize the impact of this generational shift will depend on our ability to serve these models reliably at scale.
We knew we could make the technology work, but the biggest delight of it all has been seeing what our customers do with it. The (many-model) future is bright.
Baseten @baseten

We’re thrilled to announce that we have raised $300M at a $5B valuation. The round is led by IVP and CapitalG, both doubling down on their investment in Baseten, and joined by 01A, Altimeter, Battery Ventures, BOND, BoxGroup, Blackbird Ventures, Conviction, Greylock, and NVIDIA. Read more here: baseten.co/blog/announcin…

Pankaj Gupta reposted
Baseten @baseten
We’re thrilled to announce that we have raised $300M at a $5B valuation. The round is led by IVP and CapitalG, both doubling down on their investment in Baseten, and joined by 01A, Altimeter, Battery Ventures, BOND, BoxGroup, Blackbird Ventures, Conviction, Greylock, and NVIDIA. Read more here: baseten.co/blog/announcin…
Pankaj Gupta @defpan
@baseten Incredibly proud of the team for turning hard optimization problems into real-world wins.
Pankaj Gupta reposted
Baseten @baseten
Tired of waiting for video generation? Say less. We've optimized the Wan 2.2 runtime to hit: 3x faster inference on NVIDIA Blackwell, 2.5x faster on Hopper, 67% cost reduction. Read the full breakdown of our kernel optimizations and benchmarks here: baseten.co/blog/wan-2-2-v…