KAIST AI

126 posts

@KAIST_AI

The Kim Jaechul Graduate School of AI at KAIST

Seoul, Republic of Korea · Joined March 2022
210 Following · 2.1K Followers
KAIST AI reposted
Woongyeong Yeo @wgcyeo
📢 New preprint out on contextual integrity (CI) and a new Product-of-Experts (PoE) view of self-distillation! Introducing SelfCI, a novel self-distillation framework that operationalizes CI by optimizing for the intersection of task utility and minimal disclosure. 🧵👇
Replies: 1 · Reposts: 10 · Likes: 25 · Views: 2.2K
KAIST AI reposted
NEC Laboratories Europe
Our work shows that using reasoning models as evaluators improves evaluation quality with additional test-time compute, enabling stronger re-ranking of #languagemodel outputs & matching the gains of increased compute at generation time. Learn how: neclab.eu/research-group… #NECLabs
Replies: 0 · Reposts: 2 · Likes: 3 · Views: 128
KAIST AI reposted
Yujin Jeong @yyujjinii
Diffusion models fail at multi-object generation — but why? 🤔 In our #ICML2026 paper, we built MOSAIC, a controlled framework to diagnose these failures. Spoiler: it's not mainly data imbalance. Scene complexity and missing compositions in training matter much more! ✨ (1/n)
Replies: 4 · Reposts: 23 · Likes: 125 · Views: 10.2K
KAIST AI reposted
Yumin Choi @yumin_choi_
Can LLM agents build memory before seeing any user task? Memory is usually built from human tasks or deployment interactions. New tool environments often have neither, creating a cold-start gap. Introducing PREPING: building agent memory without tasks. dozi01.github.io/preping-projec…
Replies: 5 · Reposts: 9 · Likes: 21 · Views: 2.4K
KAIST AI reposted
Seokwon Jung @memeJung20
LLM memory systems can store facts. They can't reason about what changes when one of those facts updates. We tested 6 systems across 3 paradigms. All collapse on dependency reasoning: Cascade 3%, Absence 1%. 📜 MEME: Multi-entity & Evolving Memory Evaluation 🧵 1/n
Replies: 5 · Reposts: 7 · Likes: 8 · Views: 770
KAIST AI reposted
Jiyeon Kim @jiyeonkimd
📢 Diffusion-based LLM paper accepted to #ICML2026 🥳 Diffusion LLMs promise parallel & bidirectional generation, but fully non-autoregressive (NAR) decoding still struggles in practice. We analyze why NAR fails and show how minimal interventions can substantially improve it!
Replies: 2 · Reposts: 15 · Likes: 119 · Views: 4.9K
KAIST AI reposted
Geewook Kim @GeewookKim
Happy to share our #ICML2026 paper! Decentralized Instruction Tuning: Conflict-Aware Splitting and Weight Merging naver-ai.github.io/merit/ Massive SFT as one joint run? -> We split, train, merge. No sync, matches or beats joint training. See you in Seoul! @KAIST_AI #NAVERCloud
Replies: 0 · Reposts: 4 · Likes: 23 · Views: 1K
KAIST AI reposted
Pan Lu @lupantech
Excited to share that AgentFlow has been selected as an ICLR 2026 Oral 🎉 agentflow.stanford.edu
Since launch, AgentFlow has also grown to 1.7K GitHub stars. Thank you so much for the support.
AgentFlow is a trainable multi-agent system where specialized agents learn to plan and use tools in the flow of a task. We are excited to present it at ICLR.
🛠️ Code: github.com/lupantech/Agen…
🤖 Models: huggingface.co/AgentFlow/mode…
🚀 Demo: huggingface.co/spaces/AgentFl…
🎥 Video: youtube.com/watch?v=kIQbCQ…
Huge shoutout to the amazing team behind this work: 🌟 @zhuofengli96475, @GhxIsaac, @SeungjuHan3, @ShengLiu_, @jianwen_xie, @yuz9yuz, @YejinChoinka, @james_y_zou
And thank you to our supporters: 📷 @LambdaAPI, @RenPhilanthropy, @StanfordHAI, @StanfordAILab, @kaist_ai.
See you at ICLR 2026! #ICLR2026 #AgentFlow #AgenticAI #LLM #RL #ToolUse
Pan Lu @lupantech

🔥Introducing #AgentFlow, a new trainable agentic system where a team of agents learns to plan and use tools in the flow of a task.
🌐agentflow.stanford.edu
📄huggingface.co/papers/2510.05…
AgentFlow unlocks full potential of LLMs w/ tool-use. (And yes, our 3/7B model beats GPT-4o)👇
🧩A team of four specialized agents coordinates via shared memory:
Planner: plan reasoning & tool calls 🧭
Executor: invoke tools & actions 🛠
Verifier: check memory status ✅
Generator: produce final results ✍️
💡The Magic: 🌀💫 AgentFlow directly optimizes its Planner agent live, inside the system, using our new method, Flow-GRPO (Flow-based Group Refined Policy Optimization). This is "in-the-flow" reinforcement learning.
📊The Results: AgentFlow (7B backbone) outperforms top baselines on 10 benchmarks, with average gains of:
+14.9% on search 🔍
+14.0% on agentic 🤖
+14.5% on math ➗
+4.1% on science 🔬
🏆It even surpasses larger-scale models like Llama-3.1-405B and GPT-4o (~200B).
Try it yourself!
🛠️Code: github.com/lupantech/Agen…
🚀Demo: huggingface.co/spaces/AgentFl…
🤖Model: huggingface.co/AgentFlow/mode…
📊Visual: agentflow.stanford.edu/#visualization
💬Join our Slack: join.slack.com/t/agentflow-co…
#agentic #llms #RL #tooluse

Replies: 6 · Reposts: 20 · Likes: 132 · Views: 18.4K
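The four-role loop described in the quoted post (Planner, Executor, Verifier, Generator coordinating via shared memory) can be pictured with a toy sketch. This is not AgentFlow's code or API: every name below (`Memory`, `planner`, `executor`, `verifier`, `generator`, `run`, `add_tool`) is hypothetical, the agents are plain rule-based functions rather than LLMs, and the Flow-GRPO training of the Planner is not modeled at all.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Memory:
    """Hypothetical shared memory that every role reads and appends to."""
    query: str
    steps: List[dict] = field(default_factory=list)


def planner(memory: Memory) -> dict:
    # Toy planner (assumption): always routes the query to one tool.
    return {"tool": "calculator", "args": memory.query}


def executor(action: dict, tools: Dict[str, Callable[[str], str]]) -> str:
    # Invoke the tool the planner selected.
    return tools[action["tool"]](action["args"])


def verifier(memory: Memory) -> bool:
    # Toy check (assumption): stop once the latest step has a non-empty result.
    return bool(memory.steps) and memory.steps[-1]["result"] != ""


def generator(memory: Memory) -> str:
    # Produce the final answer from what is stored in memory.
    return f"{memory.query} = {memory.steps[-1]['result']}"


def run(query: str, tools: Dict[str, Callable[[str], str]], max_turns: int = 3) -> str:
    memory = Memory(query)
    for _ in range(max_turns):
        action = planner(memory)
        result = executor(action, tools)
        memory.steps.append({"action": action, "result": result})
        if verifier(memory):
            break
    return generator(memory)


def add_tool(expr: str) -> str:
    # Minimal "a+b" calculator used only for this illustration.
    a, b = expr.split("+")
    return str(int(a) + int(b))


answer = run("2+3", {"calculator": add_tool})
print(answer)
```

In this sketch only the planner decides what happens next, which is one way to read why the post singles out the Planner as the component being optimized.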
KAIST AI reposted
Kyudan Jung @KyudanJ
🎉 Our paper "Sommelier: A Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models" has been accepted to the #ACL2026 industry track! We introduce a pipeline for generating the real-world speech data necessary to build full-duplex audio language models.
Replies: 3 · Reposts: 1 · Likes: 2 · Views: 660
KAIST AI reposted
DAIR.AI @dair_ai
Coding agents learn from experience, but that knowledge stays locked in silos. Solve a thousand SWE tasks, and none of that wisdom helps with competitive coding. What if memories could transfer across domains?

The work introduces Memory Transfer Learning, a framework where coding agents share a unified memory pool across 6 heterogeneous benchmarks. They test four memory formats ranging from raw execution traces to high-level insights, and find that cross-domain memory improves average performance by 3.7%.

Why does it matter? The transferable value isn't task-specific code. It's meta-knowledge: validation routines, structured action workflows, safe interaction patterns with execution environments. Algorithmic strategy transfer accounts for only 5.5% of the gains. The real benefit comes from procedural guidance on how to act, not what to code.

Abstraction dictates transferability: high-level insights generalize well, while low-level execution traces often cause negative transfer by anchoring agents to incompatible implementation details.

Paper: arxiv.org/abs/2604.14004
Learn to build effective AI agents in our academy: academy.dair.ai
Replies: 7 · Reposts: 49 · Likes: 239 · Views: 16.2K
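The "unified memory pool" idea in the post above can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the paper's method: `MemoryEntry`, `tokenize`, and `retrieve` are hypothetical names, the three insights are invented examples, and word overlap stands in for whatever retrieval the authors actually use.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class MemoryEntry:
    source_domain: str  # benchmark family the insight came from (hypothetical labels)
    insight: str        # high-level, task-agnostic guidance


def tokenize(text: str) -> Set[str]:
    return set(text.lower().split())


def retrieve(pool: List[MemoryEntry], task: str, k: int = 2) -> List[MemoryEntry]:
    """Rank every entry by word overlap with the task, ignoring its source domain."""
    task_words = tokenize(task)
    ranked = sorted(
        pool,
        key=lambda entry: len(task_words & tokenize(entry.insight)),
        reverse=True,
    )
    return ranked[:k]


# One pool shared by all domains; the entries are made-up examples.
pool = [
    MemoryEntry("swe", "run the existing tests before and after every edit"),
    MemoryEntry("competitive", "check edge cases such as empty input before you submit"),
    MemoryEntry("ml", "log the random seed so a failing run can be reproduced"),
]

task = "make the existing tests pass after the parser edit"
top = retrieve(pool, task, k=1)
print(top[0].source_domain, "->", top[0].insight)
```

Because `retrieve` never filters on `source_domain`, an insight recorded in one kind of task can surface for another, which is the cross-domain reuse the post describes.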
KAIST AI reposted
Turing Post @TheTuringPost
.@KAIST_AI and @nyuniversity proposed a cross-domain shared memory for coding agents.
This idea is called Memory Transfer Learning (MTL).
Build one big memory pool from many different kinds of coding tasks and let the agent reuse that memory across domains → This memory can become a shared resource and a general experience library for many agents and models.
The improvement (+3.7% on average) comes from meta-knowledge:
- how to validate a solution
- how to structure debugging
- what checks to run
- how to detect failure patterns
And all of this should be at the right level of abstraction, because memories that are too specific to the task hurt performance.
So debugging memory, code generation memory, testing memory → all go into the same pool. The more memory you have, the better the transfer works.
MTL is the way for the coding agent to reuse general reasoning and checking rather than just exact solution traces.
Replies: 3 · Reposts: 42 · Likes: 175 · Views: 9.9K
KAIST AI reposted
Kangsan Kim @kangsan_kim_
💻 🧠 Does SWE memory help ML programming tasks in coding agents?
Super excited to introduce 𝗠𝗲𝗺𝗼𝗿𝘆 𝗧𝗿𝗮𝗻𝘀𝗳𝗲𝗿 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴, a framework that leverages cross-domain coding memory, enabling agents to reuse experiences beyond task boundaries and improve memory utilization.
MTL improves coding agent by 𝟯.𝟳% 𝗼𝗻 𝗮𝘃𝗲𝗿𝗮𝗴𝗲 over a zero-shot baseline across six benchmarks.
💡Key Insights
1. 𝐌𝐞𝐦𝐨𝐫𝐲 𝐓𝐫𝐚𝐧𝐬𝐟𝐞𝐫 𝐖𝐨𝐫𝐤𝐬! Memory Transfer Learning significantly improves coding agent performance and outperforms self-evolving methods in effectiveness and efficiency.
2. 𝐓𝐫𝐚𝐧𝐬𝐟𝐞𝐫𝐚𝐛𝐥𝐞 𝐤𝐧𝐨𝐰𝐥𝐞𝐝𝐠𝐞 𝐢𝐬 𝐦𝐨𝐬𝐭𝐥𝐲 𝐦𝐞𝐭𝐚-𝐦𝐞𝐦𝐨𝐫𝐲 Transferable knowledge exists across distinct task types, and its primary form is meta-memory encoding procedural and behavioral guidance, not domain-specific knowledge.
3. 𝐀𝐛𝐬𝐭𝐫𝐚𝐜𝐭𝐢𝐨𝐧 𝐢𝐬 𝐚 𝐤𝐞𝐲 𝐝𝐫𝐢𝐯𝐞𝐫 𝐨𝐟 𝐞𝐟𝐟𝐞𝐜𝐭𝐢𝐯𝐞 𝐭𝐫𝐚𝐧𝐬𝐟𝐞𝐫 More abstract and generalized memory representations yield higher transfer effectiveness by avoiding brittle implementation anchoring.
Project Page: lnkd.in/gHp8VPrb
@KAIST_AI @nyuniversity
Replies: 1 · Reposts: 28 · Likes: 88 · Views: 5.5K
KAIST AI reposted
Kyudan Jung @KyudanJ
Thrilled to announce our 'Talk to your Slides' paper is accepted at #ACL2026 Findings! This paper explores how to edit PPT slides with maximum efficiency. It’s a project that truly helped me grow; after facing initial rejections,
Replies: 2 · Reposts: 2 · Likes: 4 · Views: 602
KAIST AI reposted
Soyeong Jeong @SoyeongJeong97
Super excited to share that one of my favorite papers, “When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs,” has been accepted to #ACL2026 Findings! 🎉
Soyeong Jeong @SoyeongJeong97

🧠📚 When thoughts meet facts. How can LLMs reuse their thoughts to reason better over long contexts even without direct retrieval? Reusable reasoning templates + iterative refinement → better factual multi-hop reasoning 🧩 📄 arxiv.org/abs/2510.07499

Replies: 1 · Reposts: 12 · Likes: 42 · Views: 2.8K
KAIST AI reposted
Woongyeong Yeo @wgcyeo
🔍 Is a single embedding space really enough for multimodal RAG? Excited to share that UniversalRAG has been accepted to the #ACL2026 main conference! 🥳 We introduce the first any-to-any multimodal RAG framework, enabling retrieval across diverse modalities and granularities.
Replies: 1 · Reposts: 11 · Likes: 27 · Views: 2.9K
KAIST AI reposted
RoboPapers @RoboPapers
Achieving generalizable manipulation is the north star for robotics learning, and while we’ve seen incredible results on specific tasks using fine-tuned VLAs, this north star has remained elusive. Perhaps what is needed is a different approach. DreamZero proposes World Action Models (WAMs), which jointly model both action and video to achieve state-of-the-art performance on benchmarks like MolmoSpaces and RoboArena. @SeonghyeonYe of @NVIDIARobotics joins us to talk about building a 14B-parameter autoregressive diffusion model which achieves state-of-the-art generalization on real-world tasks and on the best available benchmarks. Watch episode #68 of RoboPapers, with @micoolcho and @chris_j_paxton, now!
Replies: 1 · Reposts: 5 · Likes: 40 · Views: 3.7K
KAIST AI reposted
Hyeonbin Hwang @ronalhwang
New Paper💡 Have you ever heard of grokking, a sudden transition from memorization to generalization? People have attributed grokking to weight decay, Fourier structure, optimization regimes, phase transitions, numerical effects… These can shape the training dynamics, but they don’t answer the core question: "what determines which representation the model learns, and why it generalizes?" We argue the key is intrinsic task symmetries. Paper: arxiv.org/pdf/2603.01968
Replies: 18 · Reposts: 74 · Likes: 579 · Views: 32.9K