LLM Evals Workshop @NeurIPS

39 posts

@LLM_eval

NeurIPS 2025 Workshop. Evaluating the Evolving LLM Lifecycle: Benchmarks, Emergent Abilities, and Scaling

San Diego · Joined July 2025
19 Following · 216 Followers
LLM Evals Workshop @NeurIPS reposted
Berivan Isik (@BerivanISIK)
“Good researchers obsess over evals” by @natolambert at the @LLM_eval workshop!
0 replies · 5 reposts · 60 likes · 4.6K views
LLM Evals Workshop @NeurIPS reposted
Berivan Isik (@BerivanISIK)
@LLM_eval workshop has started with Orhan Firat’s talk at Upper Level Room 2. @NeurIPSConf
2 replies · 3 reposts · 33 likes · 4.5K views
LLM Evals Workshop @NeurIPS reposted
Nathan Lambert (@natolambert)
Good researchers obsess over evals
The story of Olmo 3 (post-training), told through evals
NeurIPS Talk tomorrow. Upper Level Room 2, 10:35AM.
11 replies · 45 reposts · 598 likes · 56.6K views
LLM Evals Workshop @NeurIPS reposted
Berivan Isik (@BerivanISIK)
I’ll be @NeurIPSConf all week and would love to connect on LLM data, evaluation, benchmarking, and scaling laws. If you’re working on related problems, feel free to reach out. PS: Don’t miss our one-of-a-kind workshop on LLM evaluation: sites.google.com/view/llm-eval-…
6 replies · 4 reposts · 99 likes · 9.2K views
LLM Evals Workshop @NeurIPS reposted
Huanxin Sheng (@HuanxinShe5254)
I will present my #EMNLP2025 paper at the #NeurIPS2025 LLM Eval Workshop @LLM_eval (Dec. 7th, 11:15 - 12:15, Poster Session 2). If you are interested in reliable LLM-as-a-judge, please come say hi! ☕️ #AI #LLM #LLMJudge #LLMEvaluation #ConformalPrediction
Huanxin Sheng (@HuanxinShe5254)

🤩 My FIRST paper received #EMNLP2025 SAC Highlights: "Analyzing Uncertainty of LLM-as-a-Judge: Interval Evaluations with Conformal Prediction" Huge thanks to my advisor @jiank_uiuc and collaborators Xinyi Liu, @hangfeng_he, & @jieyuzhao11! #AI #NLP #LLM #ConformalPrediction

0 replies · 2 reposts · 12 likes · 997 views
LLM Evals Workshop @NeurIPS reposted
Riccardo Cadei (@riccardocadeii)
The Narcissus Hypothesis: Recursive training on semi-synthetic corpora enforcing human alignment induces a Social Desirability Bias: world-models (Narcissus) aim to please rather than represent, polluting data lakes and charming us (Echo) into hanging on their every word.
1 reply · 4 reposts · 7 likes · 1.1K views