Pinned Tweet

Intelligence should be defined by the people closest to the work. Intelligence should be owned by all of us.
Let’s build a many-model future!
Tuhin Srivastava@tuhinone
Baseten
2.3K posts

@baseten
Inference is everything.


We serve Qwen3-TTS on vLLM-Omni at $3 per 1M characters. That's 90% cheaper than comparable closed-source TTS APIs. Our engineers optimized a single-replica serving stack to get there. Details on the optimized stack and cost per concurrent stream here.