Aakash Gupta (@aakashgupta)
The entire AI industry spent a week convinced DeepSeek had secretly launched V4. Reuters reported it. Developers debated it. OpenRouter usage charts broke.
It was Xiaomi.
A smartphone and electric vehicle company just shipped a 1-trillion-parameter model that topped the world's largest API aggregation platform, and nobody guessed the origin because the model was too good to be associated with a hardware company.
The stealth launch as "Hunter Alpha" on March 11 was the most elegant product validation in recent AI history. No brand, no attribution, no expectations. Just raw performance. The model processed over 1 trillion tokens in 8 days. Developers organically chose it over every labeled frontier model on the platform. When Reuters tested the chatbot, it identified itself only as "a Chinese AI model primarily trained in Chinese" with a May 2025 knowledge cutoff, the exact same cutoff DeepSeek reports.
The person behind this is Luo Fuli. Born in 1995. Eight papers at ACL as a graduate student at Peking University. Alibaba DAMO Academy. Then DeepSeek, where she co-developed V2 and contributed to R1. Lei Jun reportedly offered tens of millions of yuan to recruit her. She joined Xiaomi in November 2025. Four months later, she's shipping a model that benchmarks alongside Claude Sonnet 4.6 and GPT-5.2 at one-fifth the API cost.
The detail that tells you everything about how this team operates: when Luo first experienced a complex agentic scaffold, she tried to convince the MiMo team to adopt it. They resisted. So she issued a mandate: anyone on the team with fewer than 100 conversations with the system by the next day could quit. They all stayed. That imagination converted into research velocity.
The architectural bets matter. Hybrid Attention for long-context efficiency. MTP inference for low latency. 1M context window. 42B activated parameters out of 1T total. These are infrastructure decisions optimized for agents that run autonomously for hours, not chatbots that answer one question at a time.
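For a sense of scale, here is a rough sketch of what "42B activated out of 1T total" implies. The two parameter counts come from the post. The 2-FLOPs-per-active-parameter figure is a common rule of thumb for a forward pass, not something Xiaomi has stated, and the function names are made up for illustration.

```python
# Parameter counts as quoted in the post.
TOTAL_PARAMS = 1_000_000_000_000   # 1T total
ACTIVE_PARAMS = 42_000_000_000     # 42B activated per token

# Assumption: ~2 FLOPs per active parameter per token (forward pass only).
# This is a generic rule of thumb, not a published figure for this model.
FLOPS_PER_ACTIVE_PARAM = 2


def active_fraction(active: int = ACTIVE_PARAMS, total: int = TOTAL_PARAMS) -> float:
    """Share of the model's weights used for any single token."""
    return active / total


def approx_forward_flops_per_token(active: int = ACTIVE_PARAMS) -> int:
    """Back-of-envelope forward-pass compute per generated token."""
    return FLOPS_PER_ACTIVE_PARAM * active


if __name__ == "__main__":
    print(f"Active share of weights: {active_fraction():.1%}")
    print(f"Approx. forward FLOPs per token: {approx_forward_flops_per_token():.2e}")
```

Under those assumptions only about 4% of the weights are touched per token, which is the kind of ratio that keeps per-token cost down on long autonomous runs.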
Pricing: $1/$3 per million tokens up to 256K context. $2/$6 for 256K to 1M. Claude Sonnet 4.6 costs roughly 5x that. Xiaomi's shares rose 5.8% on the announcement.
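A quick sketch of how those two tiers translate into a per-request bill. Several things here are assumptions, not confirmed by the post: that the first number in each pair is the input price and the second the output price, that "256K" means 256,000 tokens, that the tier is picked by the request's total context (input plus output), and that the whole request is billed at that tier's rate. The `Tier` class and `request_cost` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    max_context: int      # largest context (in tokens) this tier covers
    input_per_m: float    # assumed: USD per million input tokens
    output_per_m: float   # assumed: USD per million output tokens


# Prices as quoted in the post; the input/output reading is an assumption.
TIERS = (
    Tier(max_context=256_000, input_per_m=1.0, output_per_m=3.0),
    Tier(max_context=1_000_000, input_per_m=2.0, output_per_m=6.0),
)


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request under the assumed tier rules."""
    context = input_tokens + output_tokens
    for tier in TIERS:
        if context <= tier.max_context:
            return (
                input_tokens * tier.input_per_m
                + output_tokens * tier.output_per_m
            ) / 1_000_000
    raise ValueError("request exceeds the 1M-token context window")


if __name__ == "__main__":
    # A short agent step versus a long-context one.
    print(f"${request_cost(100_000, 10_000):.2f}")
    print(f"${request_cost(500_000, 20_000):.2f}")
```

If the provider bills the tiers differently (for example, by input length alone), only the `context` line needs to change.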
The real DeepSeek V4 still hasn't shipped. The model everyone mistook for it already has a trillion tokens of real-world usage data.