
Jamin Ball
@jaminball
6.4K posts
Altimeter Partner working with software businesses at the earliest stages of product-market fit. Dad to 4 amazing kids. No investment advice; all views personal.



Among the fastest DeepSeek V3.2, MiniMax-M2.5, and Qwen 3.5 397B inference on the market, per Artificial Analysis benchmarks (April 2026). ⚡️🤖 Sub-1-second TTFT. 230 tokens per second. Every layer of the stack co-designed with @Inferact, performance-optimized in @vllm_project, all on @NVIDIA HGX B300. Live now on DigitalOcean Serverless Inference. Full breakdown in the comments. ⬇️
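For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how TTFT and output tokens/s are typically derived from the arrival times of streamed tokens. The function name and the timestamp list are illustrative assumptions, not the benchmark harness; the numbers are chosen only to mirror the quoted 0.96 s TTFT and 230 tok/s.

```python
# Sketch (assumed helper, not the actual benchmark code): deriving
# TTFT and output throughput from streaming-token timestamps.

def ttft_and_throughput(request_start: float, token_times: list[float]):
    """TTFT = arrival of the first token relative to request start.
    Throughput = remaining tokens divided by the span from first to
    last token (a common way to exclude TTFT from the decode rate)."""
    ttft = token_times[0] - request_start
    span = token_times[-1] - token_times[0]
    tok_per_s = (len(token_times) - 1) / span if span > 0 else float("inf")
    return ttft, tok_per_s

# Illustrative timestamps: first token at 0.96 s, then one token
# every 1/230 s, matching the figures quoted in the post.
times = [0.96 + i / 230 for i in range(100)]
ttft, tps = ttft_and_throughput(0.0, times)
print(f"TTFT={ttft:.2f}s, throughput={tps:.0f} tok/s")
# → TTFT=0.96s, throughput=230 tok/s
```

Note the convention: including the first token in the throughput window would mix queueing/prefill time into the decode rate, so the span starts at the first token.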

Eli Lilly's deal with Profluent aims to go beyond CRISPR by using AI-designed enzymes to insert entire genes. It could reshape genetic medicine. trib.al/newhN5p



"Any feature we release, a competitor could release within two weeks." @MatanSF (@FactoryAI) on why the moat isn't software anymore. @dsa (@livekit) on building the framework for voice, video, and physical AI. @gsivulka (@HebbiaAI) on what it takes to win in vertical AI.

They join @jason on This Week in AI, Episode 11:
00:00 Intro & AGI debate
03:30 Factory: autonomy for software engineering
04:29 LiveKit: open source to ChatGPT voice
10:31 Hebbia: AI for capital markets
13:21 SpaceX-Cursor $60 billion deal breakdown
26:28 Moats in the age of vibe coding
38:10 Deterministic agents vs. open chaos
45:56 DeepSeek V4
01:05:23 OpenAI's spend problem
01:12:08 P-doom scores

🏆 vLLM powers the fastest inference on NVIDIA Blackwell Ultra on Artificial Analysis.

On @digitalocean's Serverless Inference, powered by vLLM on NVIDIA HGX B300:
🥇 AA #1 output speed for DeepSeek V3.2 (230 tok/s, 0.96s TTFT) and Qwen 3.5 397B
🔧 MiniMax-M2.5: 23% TPOT gain via an EAGLE3 draft model trained on TorchSpec

Co-design highlights:
- NVFP4 quantization on Blackwell Ultra
- EAGLE3 + MTP speculative decoding
- Per-model kernel fusion

Thanks to @digitalocean, @nvidia, and @inferact for the collaboration. Optimizations land back in open-source vLLM.
🔗 digitalocean.com/blog/how-we-bu…
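To make the "23% TPOT gain via a draft model" claim concrete, here is a back-of-the-envelope sketch of how speculative decoding turns draft-token acceptance into a TPOT reduction. The expectation formula is the standard speculative-sampling result; the acceptance rate, draft length, and overhead factor below are assumed values picked to land near the quoted ~23%, not the measured DigitalOcean configuration.

```python
# Sketch: how an EAGLE3-style draft model reduces TPOT.
# All parameter values are illustrative assumptions.

def expected_tokens_per_step(alpha: float, k: int) -> float:
    """Expected tokens produced per target-model verification step,
    with draft length k and per-token acceptance rate alpha:
    E = (1 - alpha^(k+1)) / (1 - alpha)  (geometric-series expectation,
    counting the bonus token emitted on full acceptance)."""
    return (1 - alpha ** (k + 1)) / (1 - alpha)

# If each verification step costs roughly one normal decode step plus
# a drafting overhead, TPOT scales as overhead / tokens_per_step.
alpha, k, overhead = 0.4, 3, 1.25  # assumed, chosen to land near ~23%
speedup = expected_tokens_per_step(alpha, k) / overhead
tpot_reduction = 1 - 1 / speedup
print(f"speedup={speedup:.2f}x, TPOT reduction={tpot_reduction:.0%}")
# → speedup=1.30x, TPOT reduction=23%
```

The takeaway: even a modest per-token acceptance rate pays off, because each accepted draft token skips a full serial pass through the target model.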

TEHRAN, April 29, 2026 -- Less than a week after the release of @deepseek_ai DeepSeek v4 Pro, the cracked team at @vllm_project and @inferact has achieved a considerable improvement on GB200 (Dynamo + vLLM). This is largely due to the release of vLLM 0.20.0, which ships with the MegaMoE kernel enabled for DEP deployments. Great work -- we are excited to highlight more improvements over the coming days.
