Gabrielle Kaili-May Liu

50 posts

@pybeebee

PhD Student in Computer Science @Yale

Joined September 2015
148 Following · 92 Followers
Gabrielle Kaili-May Liu@pybeebee·
Traveling to #EMNLP2025 to present our work “MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs” this week! 🇨🇳 Come by & let's chat about LLM faithfulness / uncertainty / calibration! 📍Poster Session 7 @ Hall C3 🗓Fri, Nov 7 @ 2-3:30p 🔗tinyurl.com/metafaith
Gabrielle Kaili-May Liu@pybeebee

🎉 Delighted to announce that MetaFaith has been accepted to #EMNLP2025 Main! In this work we systematically study how well LLMs can express their internal uncertainty in words, offering a metacognition-inspired way to improve this ability 🧠✨ Check out more details below!👇

Gabrielle Kaili-May Liu@pybeebee·
🚀 RAG systems excel at answering questions—but what happens when the corpus has NO answer or complex multi-hop reasoning is required? Moreover, how can we build benchmarks to stress-test RAG systems in such settings in a realistic way? See our new preprint to find out! 🧵👇
Gabrielle Kaili-May Liu@pybeebee·
Excited to present this at #EMNLP2025 in just over a month! It turns out that even flagship models like GPT-5 still struggle at faithfully expressing uncertainty 🤔 📊 Full results for the newest models are now live👇 arxiv.org/abs/2505.24858
Gabrielle Kaili-May Liu@pybeebee

🎉 Delighted to announce that MetaFaith has been accepted to #EMNLP2025 Main! In this work we systematically study how well LLMs can express their internal uncertainty in words, offering a metacognition-inspired way to improve this ability 🧠✨ Check out more details below!👇

Gabrielle Kaili-May Liu reposted
Alan Li@alanli2020·
1/9 🚀 New paper: Demystifying Scientific Problem-Solving in LLMs — How does reasoning enhancement affect knowledge recall, and do LLMs benefit from external knowledge complimentary to reasoning? Tldr; 📊 SciReas: holistic and efficient evaluation suite for scientific reasoning 🧠 KRUX: a novel framework to study knowledge vs reasoning in LLMs 🔑 Findings: knowledge is a bottleneck; reasoners + in-context knowledge help; long CoT helps knowledge recall/utilization
Gabrielle Kaili-May Liu@pybeebee·
🎉 Delighted to announce that MetaFaith has been accepted to #EMNLP2025 Main! In this work we systematically study how well LLMs can express their internal uncertainty in words, offering a metacognition-inspired way to improve this ability 🧠✨ Check out more details below!👇
Gabrielle Kaili-May Liu@pybeebee

🔥 Excited to share MetaFaith: Understanding and Improving Faithful Natural Language Uncertainty Expression in LLMs🔥 How can we make LLMs talk about uncertainty in a way that truly reflects what they internally "know"? Check out our new preprint to find out! Details in 🧵(1/n):

Gabrielle Kaili-May Liu@pybeebee·
I will be presenting our work 𝗠𝗗𝗖𝘂𝗿𝗲 at #ACL2025NLP in Vienna this week! 🇦🇹 Come by if you’re interested in multi-doc reasoning and/or scalable creation of high-quality post-training data! 📍 Poster Session 4 @ Hall 4/5 🗓️ Wed, July 30 | 11-12:30 🔗 aclanthology.org/2025.acl-long.…
Gabrielle Kaili-May Liu@pybeebee

🔥Thrilled to introduce MDCure: A Scalable Pipeline for Multi-Document Instruction-Following 🔥 How can we systematically and scalably improve LLMs' ability to handle complex multi-document tasks? Check out our new preprint to find out! Details in 🧵 (1/n):

Gabrielle Kaili-May Liu reposted
Sophia S. Han@HanSineng·
Excited to see more investigation into LLM creativity. We have some pioneering work on this topic as well: Creativity or Brute Force? Using Brainteasers as a Window into the Problem-Solving Abilities of Large Language Models. arxiv.org/pdf/2505.10844.
Yiyou Sun@YiyouSun

🚨 New study on LLM's reasoning boundary! Can LLMs really think out of the box? We introduce OMEGA—a benchmark probing how they generalize: 🔹 RL boosts accuracy on slightly harder problems with familiar strategies, 🔹 but struggles with creative leaps & strategy composition. 👇

Gabrielle Kaili-May Liu@pybeebee·
🔥 Excited to share MetaFaith: Understanding and Improving Faithful Natural Language Uncertainty Expression in LLMs🔥 How can we make LLMs talk about uncertainty in a way that truly reflects what they internally "know"? Check out our new preprint to find out! Details in 🧵(1/n):