
Dayoon Ko
@dayoon12161
M.S./Ph.D. integrated student in CSE @SeoulNatlUni | Research Intern @LG_AI_Research

❓Do LLMs retain the ability to acquire new knowledge throughout pretraining? If not, what is the driving force behind the decline? ❗Our findings reveal that decreasing knowledge entropy hinders knowledge acquisition and retention as pretraining progresses. 📄arxiv.org/abs/2410.01380
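For intuition only (not the paper's code): knowledge entropy roughly measures how broadly the model's FFN layers spread weight over their "memory" (value) vectors when processing a token. Below is a minimal sketch assuming a simplified normalized-absolute-coefficient definition; `knowledge_entropy` and the toy tensors are illustrative, so check the paper for the exact formulation.

```python
import torch

def knowledge_entropy(coeffs: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Entropy of how broadly an FFN spreads weight over its value
    ("memory") vectors at one token position.

    `coeffs` is assumed to be the intermediate FFN activation (one weight
    per memory vector); absolute values are normalized into a probability
    distribution -- an illustrative simplification, not the paper's exact
    definition.
    """
    p = coeffs.abs()
    p = p / (p.sum(dim=-1, keepdim=True) + eps)
    return -(p * (p + eps).log()).sum(dim=-1)

# Toy illustration: a sharp activation pattern (few active memories)
# yields lower entropy than a diffuse one.
sharp = torch.tensor([10.0, 0.1, 0.1, 0.1])
diffuse = torch.tensor([1.0, 1.0, 1.0, 1.0])
print(knowledge_entropy(sharp), knowledge_entropy(diffuse))
```

On this view, the paper's finding is that pretraining drives activations toward the "sharp" regime, so fewer memory vectors stay engaged for storing new facts.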

🚨 Announcing Generalized Correctness Models (GCMs) 🚨
Finding that LLMs have little self-knowledge about their own correctness, we train an 8B GCM to predict the correctness of many models. It is more accurate than training model-specific CMs and outperforms a larger Llama-3-70B's self-emitted confidences in downstream selective prediction tasks.

We motivate GCMs and analyze them by answering 2 questions:

❓ RQ1: Are LLMs better than other LLMs at predicting their own correctness? We find that they are not; instead, historical information (past LLM outputs and their correctness) drives performance, motivating cross-model transfer and the training of GCMs!

❓ RQ2: How can we use historical information from multiple models for correctness prediction? Within RQ2, we explore 3 further subquestions, informing the design of GCMs (see the sketch after this list):

1⃣ How does confidence prediction generalize across models? GCMs transfer strategies across models and datasets, even beating models trained directly on OOD datasets.

2⃣ What information should GCMs condition on? The exact way an LLM phrases an answer is a strong predictor of correctness, and strategies leveraging world knowledge seem to drive generalization.

3⃣ How do alternative methods for encoding history (e.g., post hoc calibration, ICL) compare? Including historical information via ICL can help larger models predict correctness, but it underperforms GCMs, and post hoc calibration can complement GCMs to reduce calibration error.

🧵👇
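For readers who want the mechanics, here is a minimal sketch of the setup under stated assumptions: `Record`, `build_prompt`, `gcm_score`, and the 0.7 threshold are hypothetical names and values, not the paper's actual interface, and the stub scorer stands in for the trained 8B GCM.

```python
"""Sketch of a Generalized Correctness Model (GCM) pipeline.

All identifiers here are illustrative assumptions, not the paper's API.
"""
from dataclasses import dataclass

@dataclass
class Record:
    question: str
    answer: str      # the exact phrasing matters (RQ2, point 2)
    model: str       # which LLM produced the answer
    correct: bool    # graded against the reference (dummy at test time)

def build_prompt(history: list[Record], query: Record) -> str:
    """Serialize past (question, answer, correct) triples from *many*
    models, then the query answer whose correctness we want predicted."""
    lines = [
        f"Q: {r.question}\nA ({r.model}): {r.answer}\nCorrect: {r.correct}"
        for r in history
    ]
    lines.append(f"Q: {query.question}\nA ({query.model}): {query.answer}\nCorrect:")
    return "\n\n".join(lines)

def gcm_score(prompt: str) -> float:
    """Placeholder for the trained GCM: would return P(correct).
    A constant stub so the sketch runs end to end."""
    return 0.5

def selective_predict(history: list[Record], query: Record, threshold: float = 0.7):
    """Downstream selective prediction: keep the answer only when the
    GCM's predicted correctness clears the threshold, else abstain."""
    p = gcm_score(build_prompt(history, query))
    return query.answer if p >= threshold else None

history = [Record("2+2?", "4", "model-A", True),
           Record("Capital of France?", "Lyon", "model-B", False)]
query = Record("3*3?", "9", "model-C", correct=False)  # label unknown at test time
print(selective_predict(history, query))
```

The key design choice this sketch reflects: the predictor conditions on cross-model history rather than on any single model's internals, which is what lets one GCM generalize across models.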

🙁 LLMs are overconfident even when they are dead wrong. 🧐 What about reasoning models? Can they actually tell us “My answer is only 60% likely to be correct”? ❗Our paper suggests that they can! Through extensive analysis, we investigate what enables this emergent ability.
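One concrete way to test such claims (a sketch, not the paper's evaluation code): parse the verbalized percentage out of each response and check calibration with expected calibration error (ECE). `parse_verbalized_confidence` and the toy data are illustrative assumptions.

```python
import re
import numpy as np

def parse_verbalized_confidence(text: str) -> float | None:
    """Pull a stated percentage like '60% likely to be correct' out of a
    model response; returns None if the model gave no number."""
    m = re.search(r"(\d{1,3})\s*%", text)
    return int(m.group(1)) / 100 if m else None

def expected_calibration_error(confs, correct, n_bins=10):
    """Standard ECE: bin predictions by confidence and compare each bin's
    mean confidence to its empirical accuracy, weighted by bin size."""
    confs = np.asarray(confs, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confs > lo) & (confs <= hi)
        if mask.any():
            ece += mask.mean() * abs(confs[mask].mean() - correct[mask].mean())
    return ece

# Toy usage with made-up numbers:
responses = ["My answer is only 60% likely to be correct.", "I am 90% sure."]
confs = [parse_verbalized_confidence(r) for r in responses]
print(expected_calibration_error(confs, correct=[0, 1]))
```

A well-calibrated reasoning model, in this sense, is one whose "60%" answers are right about 60% of the time.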

🎉Our paper "Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates" is accepted to #ACL2025 Main!🎉 We introduce a benchmark for multimodal "deception" + an LLM-based diversified attack. 🚀 Preprint coming soon!
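While the preprint is pending, here is a rough sketch of the underlying check (my illustration, not the benchmark itself): ask whether CLIP scores an LLM-rewritten, compositionally wrong caption above the true one. The checkpoint choice, `example.jpg`, and both captions are assumptions.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_scores(image: Image.Image, texts: list[str]) -> torch.Tensor:
    """Scaled cosine-similarity logits between one image and several captions."""
    inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        return model(**inputs).logits_per_image.squeeze(0)

# Hypothetical example: does CLIP prefer a caption whose composition an
# LLM has flipped (subject and object swapped) over the true one?
image = Image.open("example.jpg")  # placeholder path
true_caption = "a dog chasing a ball in the park"
adversarial = "a ball chasing a dog in the park"  # LLM-perturbed composition
scores = clip_scores(image, [true_caption, adversarial])
print("fooled" if scores[1] > scores[0] else "robust")
```

If the perturbed caption wins, the representation has confused composition for bag-of-words overlap, which is the failure mode such a benchmark probes.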