
📏 "UNCLE" by @caiqizh - New benchmark for evaluating uncertainty expression in long-form generation
aclanthology.org/2025.emnlp-mai…
CambridgeLTL
266 posts

@CambridgeLTL
Language Technology Lab (LTL) at the University of Cambridge. Computational Linguistics / Machine Learning / Deep Learning. Focus: Multilingual NLP and Bio NLP.




Can AI simulate human behavior? 🧠 The promise is revolutionary for science & policy. But there’s a huge "IF": Do these simulations actually reflect reality? To find out, we introduce SimBench: The first large-scale benchmark for group-level social simulation. (1/9)







