
Michael Feil
@feilsystem
Accelerating LLMs @Basetenco - long-context and embedding inference (https://t.co/IdBf5U7mS3) - opinions are my own.




Introducing RadixMLP: intra-batch prefix deduplication for 1.4–5x faster prefill. Tokens with identical prefixes (like system prompts or shared queries) produce identical activations. @feilsystem developed RadixMLP to eliminate this redundancy, then open-sourced it and added it to TEI and BEI. baseten.co/resources/rese…
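The idea in the post above can be illustrated with a small sketch. This is not the RadixMLP implementation: the function names (`build_prefix_index`, `step`, `dedup_forward`, `naive_forward`) are hypothetical, and a toy integer recurrence stands in for a causal model layer. It only shows the bookkeeping: sequences in a batch are folded into a prefix tree, each unique prefix is computed once, and results are gathered back to the original layout. How the real kernels handle attention and GPU memory is not covered here.

```python
from typing import Dict, List, Tuple


def build_prefix_index(
    batch: List[List[int]],
) -> Tuple[List[int], List[int], List[List[int]]]:
    """Fold a batch into a prefix tree (hypothetical helper).

    Returns:
      node_token:  token id at each unique prefix node
      node_parent: slot of the parent prefix (-1 for a first token)
      gather:      gather[b][i] is the slot for position i of sequence b
    """
    children: Dict[Tuple[int, int], int] = {}  # (parent slot, token) -> slot
    node_token: List[int] = []
    node_parent: List[int] = []
    gather: List[List[int]] = []
    for seq in batch:
        parent = -1
        slots: List[int] = []
        for tok in seq:
            slot = children.get((parent, tok))
            if slot is None:
                # First time this exact prefix appears in the batch.
                slot = len(node_token)
                children[(parent, tok)] = slot
                node_token.append(tok)
                node_parent.append(parent)
            slots.append(slot)
            parent = slot
        gather.append(slots)
    return node_token, node_parent, gather


def step(prev_state: int, token: int) -> int:
    # Toy stand-in (an assumption) for a causal layer: the output depends
    # only on the prefix, so identical prefixes give identical results.
    return (prev_state * 31 + token) % 1_000_003


def dedup_forward(batch: List[List[int]]) -> Tuple[List[List[int]], int]:
    """Compute each unique prefix once, then gather per-sequence outputs."""
    node_token, node_parent, gather = build_prefix_index(batch)
    node_state: List[int] = []
    # Parents are always created before their children, so slot order
    # is a valid evaluation order.
    for tok, parent in zip(node_token, node_parent):
        prev = node_state[parent] if parent >= 0 else 0
        node_state.append(step(prev, tok))
    outputs = [[node_state[s] for s in slots] for slots in gather]
    return outputs, len(node_token)


def naive_forward(batch: List[List[int]]) -> List[List[int]]:
    """Reference path: recompute every token of every sequence."""
    outputs = []
    for seq in batch:
        state, row = 0, []
        for tok in seq:
            state = step(state, tok)
            row.append(state)
        outputs.append(row)
    return outputs


# Three requests sharing the prefix [1, 2, 3] (e.g. a system prompt).
batch = [[1, 2, 3, 10, 11], [1, 2, 3, 20], [1, 2, 3, 10, 12]]
deduped, unique_tokens = dedup_forward(batch)
total_tokens = sum(len(s) for s in batch)
```

In this toy batch the deduplicated path evaluates `unique_tokens` prefix nodes instead of `total_tokens` positions while producing the same per-sequence outputs as `naive_forward`. The saving grows with the length of the shared prefix relative to the unique suffixes.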







Today, we’re excited to announce our $150M Series D, led by BOND, with Jay Simons joining our Board. We’re also thrilled to welcome Conviction and CapitalG to the round, alongside support from 01 Advisors, IVP, Spark Capital, Greylock Partners, Scribble Ventures, and Premji Invest. The last eighteen months have been a whirlwind; as the AI application layer has taken off, we've been proud to play a small part in helping world-class companies run their production workloads. Thanks to all our customers including Abridge, Bland, Clay, Gamma, Mirage, OpenEvidence, Sourcegraph, WRITER, and Zed Industries. We’re just getting started. If you’re building the next generation of AI products, we’d love to work with you.




It's time again for our last (now yearly) celebration extravaganza of the year. GPU MODE is meeting IRL again in downtown San Francisco on Friday, October 24, from 10am to 10pm to hack all day.

We're excited to be an OpenAI launch partner for the release of GPT OSS 120B and 20B! Model APIs coming shortly, with performance optimizations, benchmarks, and vibe checks dropping throughout the day. Stay tuned.







