Zae Myung Kim

18 posts

@zaemyung

Joined February 2022
55 Following · 39 Followers
Pinned Tweet
Zae Myung Kim@zaemyung·
🚨 New Paper Alert! 🚨 How can we align language models without drowning in prompt engineering or falling into reward hacking traps? We introduce Meta Policy Optimization (MPO), a new reinforcement learning framework that evolves its own reward model rubrics through meta-level reflection. Inspired by metacognition and evaluative thinking, MPO trains models to think about how they evaluate, not just what they generate.
🔥 Why it matters:
✔️ Boosts stability and robustness in RLAIF
✔️ Reduces human labor in prompt crafting
✔️ Generalizes across tasks: essays, summarization, ethical and mathematical reasoning
Check it out: huggingface.co/papers/2504.20…
Big thanks to co-authors @chanwoopark20 (MIT), @_vipulraheja (Grammarly), and @dongyeopkang (UMN)! #AI #LLMs #ReinforcementLearning #MetaLearning #NLP #Alignment #RLHF #RLAIF #EvaluativeThinking #PromptEngineering
Chanwoo Park@chanwoopark20·
codex 5.3 is just super wild -- very different from codex 5.2...
Zae Myung Kim@zaemyung·
Our empirical findings indicate that robustness against paraphrasing attacks arises from the preservation of higher-level discourse structures, despite variations at the sentence level.
Zae Myung Kim@zaemyung·
"Paraphrasing attacks" can compromise the effectiveness of AI content detectors. 🙀 Can hierarchical structures in texts help build a more robust detector? Our research reveals a resounding💡YES!💡Delighted to share our work on merging discourse frameworks with graph analysis.
Zae Myung Kim retweeted
Debarati Das@geekylildeb·
🚀Excited to share MinnesotaNLP's FIRST lab-wide paper (15+ team) on artifacts present in LLM-generated data! We explore the diverse world of LLM-generated text content and its impact on the artificial data ecosystem. #NLProc #syntheticdata #LLM arXiv: arxiv.org/abs/2401.14698
Zae Myung Kim retweeted
Ryan Koo@im_kooryan·
LLMs have proven to outperform humans on a multitude of tasks. Does this mean they are more biased, too? In our work, we benchmark several different LLMs as automatic evaluators for various cognitive biases. arxiv.org/abs/2309.17012
Zae Myung Kim@zaemyung·
We also augmented the training dataset with datasets from other relevant tasks, for example "Lang-8" for improving fluency. We found that many of these examples were "meaning-changed" edits geared more toward generation than revision, and thus we filtered them accordingly.
Dongyeop Kang (DK)@dongyeopkang·
Our Minnesota NLP group is getting bigger and more diverse! ❤️❤️
Zae Myung Kim retweeted
Jessy Li@jessyjli·
Excited to share DCQA (Discourse Comprehension by Question Answering), a scalable data collection framework + 22K training data capturing semantic and discursive relationships between sentences via free-form questions and their answers arxiv.org/abs/2111.00701 (1/2)