MiyoungKo
30 posts

🤔How can we systematically assess an LM's proficiency in a specific capability without using summary measures like helpfulness or simple proxy tasks like multiple-choice QA? Introducing the ✨BiGGen Bench, a benchmark that directly evaluates nine core capabilities of LMs.

🚨 New LLM personalization/alignment paper 🚨 🤔 How can we obtain personalizable LLMs without explicitly re-training reward models/LLMs for each user? ✔ We introduce a new zero-shot alignment method to control LLM responses via the system message 🚀

📢 Excited to share our latest paper on the reasoning capabilities of LLMs! Our research dives into how these models recall and utilize factual knowledge while solving complex questions. [🧵1 / 10] arxiv.org/abs/2406.19502