KAIST AI

126 posts

@KAIST_AI

The Kim Jaechul Graduate School of AI at KAIST

Seoul, Republic of Korea · Joined March 2022

210 Following · 2.1K Followers

KAIST AI reposted

Woongyeong Yeo@wgcyeo·12h

📢 New preprint out on contextual integrity (CI) and a new Product-of-Experts (PoE) view of self-distillation! Introducing SelfCI, a novel self-distillation framework that operationalizes CI by optimizing for the intersection of task utility and minimal disclosure. 🧵👇

2.2K

KAIST AI reposted

NEC Laboratories Europe@NECLabsEU·1d

Our work shows that using reasoning models as evaluators improves evaluation quality with additional test-time compute, enabling stronger re-ranking of #lanugagemodel outputs & matching the gains of increased compute at generation time. Learn how: neclab.eu/research-group… #NECLabs

128

KAIST AI reposted

Yujin Jeong@yyujjinii·3d

Diffusion models fail at multi-object generation — but why? 🤔 In our #ICML2026 paper, we built MOSAIC, a controlled framework to diagnose these failures. Spoiler: it's not mainly data imbalance. Scene complexity and missing compositions in training matter much more! ✨ (1/n)

125

10.2K

KAIST AI reposted

Yumin Choi@yumin_choi_·6d

Can LLM agents build memory before seeing any user task? Memory is usually built from human tasks or deployment interactions. New tool environments often have neither, creating a cold-start gap. Introducing PREPING: building agent memory without tasks. dozi01.github.io/preping-projec…

2.4K

KAIST AI reposted

Seokwon Jung@memeJung20·13 May

LLM memory systems can store facts. They can't reason about what changes when one of those facts updates. We tested 6 systems across 3 paradigms. All collapse on dependency reasoning: Cascade 3%, Absence 1%. 📜 MEME: Multi-entity & Evolving Memory Evaluation 🧵 1/n

770

KAIST AI reposted

Jiyeon Kim@jiyeonkimd·1 May

📢 Diffusion-based LLM paper accepted to #ICML2026 🥳 Diffusion LLMs promise parallel & bidirectional generation, but fully non-autoregressive decoding still struggles in practice. We analyzed why NAR fails, and show how minimal interventions can substantially improve it!

119

4.9K

KAIST AI reposted

Geewook Kim@GeewookKim·1 May

Happy to share our #ICML2026 paper! Decentralized Instruction Tuning: Conflict-Aware Splitting and Weight Merging naver-ai.github.io/merit/ Massive SFT as one joint run? -> We split, train, merge. No sync, matches or beats joint training. See you in Seoul! @KAIST_AI #NAVERCloud

KAIST AI reposted

Nature Conferences@NatureConf·22 Apr

Secure your discount to our event gathering scientists, engineers, and researchers from academia and industry, discussing embodied intelligence, physical AI and the future of intelligent machines. go.nature.com/4mHYFwA @KAIST_AI @NatureComms @Nature @NatureElectron

450

KAIST AI reposted

Pan Lu@lupantech·21 Apr

Excited to share that AgentFlow has been selected as an ICLR 2026 Oral 🎉 agentflow.stanford.edu Since launch, AgentFlow has also grown to 1.7K GitHub stars. Thank you so much for the support. AgentFlow is a trainable multi-agent system where specialized agents learn to plan and use tools in the flow of a task. We are excited to present it at ICLR. 🛠️ Code: github.com/lupantech/Agen… 🤖 Models: huggingface.co/AgentFlow/mode… 🚀 Demo: huggingface.co/spaces/AgentFl… 🎥 Video: youtube.com/watch?v=kIQbCQ… Huge shoutout to the amazing team behind this work: 🌟 @zhuofengli96475, @GhxIsaac, @SeungjuHan3, @ShengLiu_, @jianwen_xie, @yuz9yuz, @YejinChoinka, @james_y_zou And thank you to our supporters: 📷 @LambdaAPI, @RenPhilanthropy, @StanfordHAI, @StanfordAILab, @kaist_ai. See you at ICLR 2026! #ICLR2026 #AgentFlow #AgenticAI #LLM #RL #ToolUse

Pan Lu@lupantech

🔥Introducing #AgentFlow, a new trainable agentic system where a team of agents learns to plan and use tools in the flow of a task. 🌐agentflow.stanford.edu 📄huggingface.co/papers/2510.05… AgentFlow unlocks full potential of LLMs w/ tool-use. (And yes, our 3/7B model beats GPT-4o)👇 🧩A team of four specialized agents coordinates via shared memory: Planner: plan reasoning & tool calls 🧭 Executor: invoke tools & actions 🛠 Verifier: check memory status ✅ Generator: produce final results ✍️ 💡The Magic: 🌀💫 AgentFlow directly optimizes its Planner agent live, inside the system, using our new method, Flow-GRPO (Flow-based Group Refined Policy Optimization). This is "in-the-flow" reinforcement learning. 📊The Results: AgentFlow (7B backbone) outperforms top baselines on 10 benchmarks, with average gains of: +14.9% on search 🔍 +14.0% on agentic 🤖 +14.5% on math ➗ +4.1% on science 🔬 🏆It even surpasses larger-scale models like Llama-3.1-405B and GPT-4o (~200B). Try it yourself! 🛠️Code: github.com/lupantech/Agen… 🚀Demo: huggingface.co/spaces/AgentFl… 🤖Model: huggingface.co/AgentFlow/mode… 📊Visual: agentflow.stanford.edu/#visualization 💬Join our Slack: join.slack.com/t/agentflow-co… #agentic #llms #RL #tooluse

132

18.4K

KAIST AI reposted

Kyudan Jung@KyudanJ·19 Apr

🎉 Our paper "Sommelier: A Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models" has been accepted at the #ACL2026 industry track! We have introduced a pipeline for generating the real-world speech data necessary to build full-duplex audio language models.

660

KAIST AI reposted

DAIR.AI@dair_ai·17 Apr

Coding agents learn from experience, but that knowledge stays locked in silos. Solve a thousand SWE tasks, and none of that wisdom helps with competitive coding. What if memories could transfer across domains? The work introduces Memory Transfer Learning, a framework where coding agents share a unified memory pool across 6 heterogeneous benchmarks. They test four memory formats ranging from raw execution traces to high-level insights, and find that cross-domain memory improves average performance by 3.7%. Why does it matter? The transferable value isn't task-specific code. It's meta-knowledge: validation routines, structured action workflows, safe interaction patterns with execution environments. Algorithmic strategy transfer accounts for only 5.5% of the gains. The real benefit comes from procedural guidance on how to act, not what to code. Abstraction dictates transferability: high-level insights generalize well, while low-level execution traces often cause negative transfer by anchoring agents to incompatible implementation details. Paper: arxiv.org/abs/2604.14004 Learn to build effective AI agents in our academy: academy.dair.ai

239

16.2K

KAIST AI reposted

Turing Post@TheTuringPost·18 Apr

.@KAIST_AI and @nyuniversity proposed a cross-domain shared memory for coding agents. This idea is called Memory Transfer Learning (MTL). Build one big memory pool from many different kinds of coding tasks and let the agent reuse that memory across domains → This memory can become a shared resource and a general experience library for many agents and models.
The improvement (+3.7% on average) comes from meta-knowledge:
- how to validate a solution
- how to structure debugging
- what checks to run
- how to detect failure patterns
And all of this should be at the right level of abstraction, because memories that are too specific to the task hurt performance. So debugging memory, code generation memory, testing memory → all go into the same pool. The more memory you have, the better the transfer works. MTL is the way for the coding agent to reuse general reasoning and checking rather than just exact solution traces.

175

9.9K

KAIST AI reposted

Kangsan Kim@kangsan_kim_·16 Apr

💻 🧠 Does SWE memory help ML programming tasks in coding agents? Super excited to introduce 𝗠𝗲𝗺𝗼𝗿𝘆 𝗧𝗿𝗮𝗻𝘀𝗳𝗲𝗿 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴, a framework that leverages cross-domain coding memory, enabling agents to reuse experiences beyond task boundaries and improve memory utilization. MTL improves coding agents by 𝟯.𝟳% 𝗼𝗻 𝗮𝘃𝗲𝗿𝗮𝗴𝗲 over a zero-shot baseline across six benchmarks.
💡Key Insights
1. 𝐌𝐞𝐦𝐨𝐫𝐲 𝐓𝐫𝐚𝐧𝐬𝐟𝐞𝐫 𝐖𝐨𝐫𝐤𝐬! Memory Transfer Learning significantly improves coding agent performance and outperforms self-evolving methods in effectiveness and efficiency.
2. 𝐓𝐫𝐚𝐧𝐬𝐟𝐞𝐫𝐚𝐛𝐥𝐞 𝐤𝐧𝐨𝐰𝐥𝐞𝐝𝐠𝐞 𝐢𝐬 𝐦𝐨𝐬𝐭𝐥𝐲 𝐦𝐞𝐭𝐚-𝐦𝐞𝐦𝐨𝐫𝐲 Transferable knowledge exists across distinct task types, and its primary form is meta-memory encoding procedural and behavioral guidance, not domain-specific knowledge.
3. 𝐀𝐛𝐬𝐭𝐫𝐚𝐜𝐭𝐢𝐨𝐧 𝐢𝐬 𝐚 𝐤𝐞𝐲 𝐝𝐫𝐢𝐯𝐞𝐫 𝐨𝐟 𝐞𝐟𝐟𝐞𝐜𝐭𝐢𝐯𝐞 𝐭𝐫𝐚𝐧𝐬𝐟𝐞𝐫 More abstract and generalized memory representations yield higher transfer effectiveness by avoiding brittle implementation anchoring.
Project Page: lnkd.in/gHp8VPrb @KAIST_AI @nyuniversity

5.5K

KAIST AI reposted

Kyudan Jung@KyudanJ·30 Mar

Full duplex speech dataset preprocessing paper is out! arxiv.org/abs/2603.25750

245

KAIST AI reposted

Kyudan Jung@KyudanJ·14 Apr

Thrilled to announce our 'Talk to your Slides' paper is accepted at #ACL2026 Findings! This paper explores how to edit PPT slides with maximum efficiency. It’s a project that truly helped me grow; after facing initial rejections,

602

KAIST AI reposted

Jiyeon Kim@jiyeonkimd·7 Apr

Thrilled to share that ✨OAKS✨, our benchmark challenging LLMs to adapt to streaming knowledge, has been accepted to #ACL2026 Main! 🚀

Jiyeon Kim@jiyeonkimd

🌎Real-world knowledge evolves constantly and emerges incrementally. Can LLMs adapt to new information on the fly? 🤯Frontier models and agentic approaches all struggle, missing when to update the fact, or getting distracted by irrelevant information. We introduce ✨OAKS✨, a benchmark for evaluating models’ online adaptation to streaming, continually updating knowledge.

3.8K

KAIST AI reposted

Soyeong Jeong@SoyeongJeong97·10 Apr

Super excited to share that one of my favorite papers, “When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs,” has been accepted to #ACL2026 Findings! 🎉

Soyeong Jeong@SoyeongJeong97

🧠📚 When thoughts meet facts. How can LLMs reuse their thoughts to reason better over long contexts even without direct retrieval? Reusable reasoning templates + iterative refinement → better factual multi-hop reasoning 🧩 📄 arxiv.org/abs/2510.07499

2.8K

KAIST AI reposted

Woongyeong Yeo@wgcyeo·8 Apr

🔍 Is a single embedding space really enough for multimodal RAG? Excited to share that UniversalRAG has been accepted to the #ACL2026 main conference! 🥳 We introduce the first any-to-any multimodal RAG framework, enabling retrieval across diverse modalities and granularities.

2.9K

KAIST AI reposted

RoboPapers@RoboPapers·20 Mar

Achieving generalizable manipulation is the north star for robotics learning, and while we’ve in the past seen incredible results on specific tasks using fine-tuned VLAs, this north star has remained elusive. Perhaps what is needed is a different approach. DreamZero proposes World Action models (WAMs), which jointly model both action and video in order to achieve state-of-the-art performance on benchmarks like MolmoSpaces and RoboArena. @SeonghyeonYe of @NVIDIARobotics joins us to talk about building a 14B parameter autoregressive diffusion model which achieves state-of-the-art generalization on real world tasks and on the best available benchmarks. Watch episode #68 of RoboPapers, with @micoolcho and @chris_j_paxton, now!

3.7K

KAIST AI reposted

Hyeonbin Hwang@ronalhwang·3 Mar

New Paper💡 Have you ever heard of grokking, a sudden transition from memorization to generalization? People have attributed grokking to weight decay, Fourier structure, optimization regimes, phase transitions, numerical effects… These can shape the training dynamics, but they don’t answer the core question: "what determines which representation the model learns, and why it generalizes?" We argue the key is intrinsic task symmetries. Paper: arxiv.org/pdf/2603.01968

579

32.9K
