Zhiwei He
237 posts


From 0 to 0100.HK: MiniMax is officially listed on the Hong Kong Stock Exchange. Built on one belief: Intelligence with Everyone. We believe advanced AI should be accessible and beneficial to a wide range of users and industries. This guiding principle underscores everything we do: from research and development to how we engage with our users and partners. This isn’t the finish line. It’s fuel for our next leap toward AGI. Thank you to the millions of users & developers building and innovating with us every day ❤️ #MiniMax #IntelligenceWithEveryone #IPO

Can today's LLMs truly understand you, not just your words? 🤖❤️
Introducing SAGE (Sentient Agent as a Judge), the first evaluation framework that uses sentient agents to simulate human emotional dynamics and inner reasoning for assessing social cognition in LLM conversations.
🧠 We propose an automated "sentient-in-the-loop" framework that stress-tests an LLM's ability to read emotions, infer hidden intentions, and reply with genuine empathy.
🤝 Across 100 supportive-dialogue scenarios, sentient emotion scores strongly align with human-centric measures (BLRI: r = 0.82; empathy metrics: r = 0.79), confirming psychological validity.
📈 The Sentient Leaderboard reveals significant ranking differences from conventional leaderboards (like Arena), showing that top "helpful" models aren't always the most socially adept.
🏆 Advanced social reasoning doesn't require verbosity: the most socially adept LLMs achieve empathy with surprisingly efficient token usage!
Code: github.com/tencent/digita… 🧑‍💻
Paper: dx.doi.org/10.13140/RG.2.… 🧵

🚀 Introducing ROLL: An Efficient and User-Friendly RL Training Framework for Large-Scale Learning! 🔥 Efficient, Scalable & Flexible – Train 200B+ models with 5D parallelism (TP/PP/CP/EP/DP), seamless vLLM/SGLang switching, async multi-env rollout for maximum RL throughput!

Confused about recent LLM RL results where models improve without any ground-truth signal? We were too, until we looked at the reported numbers for the pre-RL models and realized they were severely underreported across papers. We compiled the discrepancies in the blog post below 🧵👇


