

Can today's LLMs truly understand you, not just your words? 🤖❤️ Introducing SAGE: Sentient Agent as a Judge — the first evaluation framework that uses sentient agents to simulate human emotional dynamics and inner reasoning for assessing social cognition in LLM conversations. 🧠 We propose an automated "sentient-in-the-loop" framework that stress-tests an LLM's ability to read emotions, infer hidden intentions, and reply with genuine empathy. 🤝 Across 100 supportive-dialogue scenarios, sentient emotion scores strongly align with human-centric measures (BLRI: r = 0.82; empathy metrics: r = 0.79), confirming psychological validity. 📈 The Sentient Leaderboard reveals significant ranking differences from conventional leaderboards (like Arena), showing that top "helpful" models aren't always the most socially adept. 🏆 Advanced social reasoning doesn’t require verbosity — the most socially adept LLMs achieve empathy with surprisingly efficient token usage! Code: github.com/tencent/digita… 🧑💻 Paper: dx.doi.org/10.13140/RG.2.… 🧵

