Jim Liu@jiahanjimliu
$IREN: FireworksAI, SemiAnalysis
Each neocloud comes into the industry with different advantages. IREN's hand is 2.91GW of power, a 2-3GW pipeline and enough IaaS expertise to partner with TogetherAI, Microsoft, and FireworksAI. Great companies grow their people, and FireworksAI is a key to IREN's growth in IaaS excellence. First I'll have to explain FireworksAI.
FireworksAI
FireworksAI is the commercialization of Meta's AI platform team for ads. Although Meta is not doing well in LLMs, its ads platform is the best in the world. Without the layers of bureaucracy at Meta, FireworksAI will be a clean rendition of Meta AI Infra with the knowledge of many of the key engineers. Software developers know that a clean rewrite can take a code base to a whole new level. The AI platform that powered Meta's ads and recommendations will now be available to the startups and F500s of the world.
FireworksAI Cofounders
Lin Qiao, the CEO, was the Senior Director who led Meta's AI frameworks & platform org, which powered Meta's family of applications, integrity, ads and newsfeed (1). She worked all the way down to the compiler level and deployment across Meta's datacenters (1).
Benny Chen was a Principal Engineer at Meta who led Ads Infrastructure and built low-latency infrastructure to serve billions of users and trillions of impressions (2).
Chenyu Zhao was a Senior Staff Engineer at Google who led a 50-person engineering org for Google Cloud Vertex AI. He owned ML algorithms and MLOps (3).
Dmytro Dzhulgakov, the CTO at FireworksAI, was a Meta VP who owned PyTorch, led 400 engineers, and is a core maintainer of PyTorch (4).
Dmytro Ivchenko was a Principal Engineer at Meta who was the ranking lead (5). Ranking is at the heart of Meta because it drives Meta's engagement and revenue engine.
James Reed was the Staff Engineer who led PyTorch distributed compilers (6). He is key to maximizing performance at the fundamental levels of GPUs.
Pawel Garbacki was the Principal Engineer at Meta who owned GenAI research, fine-tuning, alignment, multi-agent systems & multi-modality (7). In other words, he led Meta Newsfeed ML. A friend at Meta tells me the GenAI org is the top priority for GPU allocation.
FireworksAI Engineering
Their blog is insane; if you're a software engineer, I recommend you take a look (8). A key example is their inference serving runtime, FireAttention, which serves LLMs with 40-60% less latency than vLLM (9, 10)! vLLM is one of the most important open source projects in LLM serving right now and is used by $CRWV and $NBIS (11, 12).
Cursor, a leading AI coding agent used at Nvidia, uses FireworksAI for its Fast Apply autocomplete feature, where speed is critical (13).
Fireworks raised a $250M Series C at a $4B valuation as a pure AI PaaS (14) and currently serves GitLab, Uber, Verizon, Notion, DoorDash, HubSpot and Quora, along with many Silicon Valley startups (15).
Synergy with IREN
As a pure AI PaaS company, FireworksAI has chosen IREN to be its IaaS provider in addition to AWS, OCI, and GCP (17, 18, 19). IREN is going to give them better pricing than the hyperscalers, which would be critical to their costs as they scale to more customers. Therefore it's in Fireworks' interest to make its top-performing PaaS stack run just as fast on IREN's GPUs.
FireworksAI is differentiated by performance. Performance is easy for them; what they need is GPUs. When IREN works with FireworksAI, IREN's team will learn how to optimize performance at the IaaS level. It will be a win-win for both parties.
SemiAnalysis
SemiAnalysis measured IREN's performance on IREN's old Dell XE9680 H100 servers (20). Dell did not do well with these early on, which let SMCI take market share, but Dell is now on par with SMCI.
Now, whatever performance gaps IREN still has, FireworksAI will iron them out. FireworksAI must have already done performance testing on IREN's servers before choosing them, because performance is the core of their business.
FireworksAI brings optimized software and will help IREN optimize the networking fabric, storage interfaces, firmware/driver stacks, and the telemetry tools used to replace hardware before it wears out. IREN must have decent uptime now for MSFT to put a $2B down payment on them, but FireworksAI will help them get to the next level because FireworksAI itself "processes over 140 billion tokens daily with 99.99% API uptime, so our customers never experience interruptions" (21). Nebius only has 99.9% uptime (22), so FireworksAI is an entire digit ahead of them.
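To put that extra digit in concrete terms, here is a quick back-of-the-envelope sketch in Python. It only converts the two quoted uptime percentages into allowed downtime per year; the function name and the 365-day year are my own assumptions, and it says nothing about how either company actually measures uptime.

```python
# Rough sketch: convert an uptime percentage into allowed downtime per year.
# Assumes a 365-day year; the names below are illustrative, not from any SLA.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(uptime_percent: float) -> float:
    """Minutes of downtime per year implied by a given uptime percentage."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


for label, uptime in [("99.99% (FireworksAI quote)", 99.99), ("99.9% (Nebius figure)", 99.9)]:
    minutes = downtime_minutes_per_year(uptime)
    print(f"{label}: about {minutes:.1f} minutes of downtime per year")
```

That works out to roughly 53 minutes a year at 99.99% versus about 8.8 hours a year at 99.9%, a 10x difference.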
Do you trust SemiAnalysis, a "research company", or FireworksAI?
SemiAnalysis has done one thing right: putting Analysis in their name instead of Research. Rittenhouse Research and Culper Research should take note: using Analysis instead of Research in your name helps you hide your bias better.