Bang Zhang retweetledi

Can today's LLMs truly understand you, not just your words? 🤖❤️
Introducing SAGE: Sentient Agent as a Judge — the first evaluation framework that uses sentient agents to simulate human emotional dynamics and inner reasoning for assessing social cognition in LLM conversations.
🧠 We propose an automated "sentient-in-the-loop" framework that stress-tests an LLM's ability to read emotions, infer hidden intentions, and reply with genuine empathy.
🤝 Across 100 supportive-dialogue scenarios, sentient emotion scores strongly align with human-centric measures (BLRI: r = 0.82; empathy metrics: r = 0.79), confirming psychological validity.
📈 The Sentient Leaderboard reveals significant ranking differences from conventional leaderboards (like Arena), showing that top "helpful" models aren't always the most socially adept.
🏆 Advanced social reasoning doesn’t require verbosity — the most socially adept LLMs achieve empathy with surprisingly efficient token usage!
Code: github.com/tencent/digita… 🧑💻
Paper: dx.doi.org/10.13140/RG.2.… 🧵

English
