Usman Gohar

@UsmanGohar
Ph.D. student @IowaStateU. ML Fairness, AI/Software Safety, AI Ethics





🧪 Your LLM evaluation results could help the whole field 🚀 🧑‍🔬 Our ACL Shared task is out! We’re building a unified, crowdsourced database to create a common language for AI evaluation reporting. And we need your data. (1/2) evalevalai.com/events/shared-…

🚨 The next edition of EvalEval Workshop is coming to @aclmeeting 2026! 🧠 Workshop on "AI Evaluation in Practice: Bridging Research, Development, and Real-World Impact" 🎇 📢 CFP is now open!!! More details ⏬ 📍 San Diego 📝 Submission deadline: Mar 12, 2026

Today we’re releasing the International AI Safety Report 2026: the most comprehensive evidence-based assessment of AI capabilities, emerging risks, and safety measures to date. 🧵 (1/17)









EvalEval is back! Our view today for the 2025 EvalEval Workshop at the beautiful @UCSD campus. We have an exciting program planned, full of wonderful discussions and people on all things evals. Agenda: evalevalai.com/events/worksho… Can't be here? Join us live: meet.google.com/ozx-dsnz-gcr?h…






✨ Weekly AI Evaluation Paper Spotlight ✨ What if the average performance scores we trust are actually hiding a benchmark’s flaws? 📰“The Flaw of Averages: Quantifying Uniformity of Performance on Benchmarks” (@aardauzunoglu, @tli104, @DanielKhashabi) introduces HARMONY. 1/n

If you're at NeurIPS, you don't want to miss out!! We have an amazing program and line-up of speakers and panelists! ⏰Last week to submit your abstracts! See more: evalevalai.com/events/worksho…

An LLM-generated paper is in the top 17% of ICLR submissions in terms of average reviewer score, having received two 8's. The paper has tons of BS jargon and hallucinated references. Fortunately, one reviewer actually looked at the paper and gave it a zero. 1/3

