KnowledgeLM Workshop

14 posts

@lm_knowledge

Towards Knowledgeable Language Models @ ACL 2024 Workshop

Joined April 2024
69 Following · 51 Followers
KnowledgeLM Workshop retweeted
Manling Li @ManlingLi_
What is left for humans with powerful coding agents? Right now, we evaluate agents mostly on Success Rate. But if an agent fixes one simple issue by adding 2,000 lines of spaghetti code, is that a win? I see AI agents solve problems by endlessly adding new functions, growing chaotic, million-line codebases that no human can manage. Top engineers, by contrast, care about the elegant simplicity beneath the mess (hello, Occam's Razor). So what is left for humans? Maybe just this. I have become more and more excited about Abstraction. This paper is only about abstracting and reusing skills, like macro functions, but it might be a baby-step start.
Shiqi Chen@shiqi_chen17

📍 Can LLMs discover, abstract, and reuse higher-level tool skills across tasks? Existing tool-use benchmarks test solving tasks with fixed tools. But real workflows contain recurring structures where efficiency comes from reusable tool compositions, not isolated calls.
We introduce SkillCraft: 126 tasks across 6 domains designed to test whether LLM agents can acquire compositional skills, not just call atomic tools. We also propose Skill Mode, a lightweight protocol with four MCP primitives that lets agents compose, verify, cache, and reuse tool chains at test time.
Key findings from evaluating 8 SOTA models:
⚡ Skill Mode enables agents to self-discover and reuse skills, leading to higher success and efficiency than agents without it. The gains are larger for stronger models.
🧠 Stronger models (e.g., Claude) discover more generalizable skills, which transfer across tasks and even across models.
🔍 Deeper composition ≠ better — shallow, well-tested skills generalize best.
🔗 Paper: arxiv.org/abs/2603.00718
💻 Code: github.com/shiqichen17/Sk…
🏠 Page: skillcraft-website.github.io/page
(1/7)

KnowledgeLM Workshop retweeted
Manling Li @ManlingLi_
Failure mode of LLM Agent RL training: reasoning shrinks, becoming shorter and more similar. "Diversity" has been key to making LLM Agent RL training work, but I have always wondered how to define "diversity". RAGEN used entropy; RAGEN-v2 introduces mutual information (MI). The key insight comes from this decomposition: H(Z) = H(Z|X) + I(X;Z). So we can systematically classify four types of reasoning-evolution patterns:
- diverse reasoning
- compression reasoning
- entropy collapse
- template collapse
Top-p filtering: the most fascinating finding is that top-p filtering using reward variance is simple but effective! We also explain this failure mode via gradient updates — check more in @wzenus's thread 👇
Zihan "Zenus" Wang@wzenus

In Agent RL, models suffer from Template Collapse. They generate vast, diverse outputs (high entropy) that lose all meaningful connection to the input prompt (low mutual information). In other words, agents learn different ways to say nothing. 🚀 Introducing RAGEN-v2 -- here's how we define and fix such silent failure modes in Agent RL. 🧵
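The decomposition H(Z) = H(Z|X) + I(X;Z) quoted above can be checked numerically on a toy joint distribution over prompts X and reasoning traces Z. The numbers below are made up purely to verify the identity; "template collapse" is then the regime where H(Z) stays high while I(X;Z) is low:

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Toy joint distribution p(x, z): two prompts, two reasoning traces.
p_xz = {("x0", "z0"): 0.3, ("x0", "z1"): 0.2,
        ("x1", "z0"): 0.1, ("x1", "z1"): 0.4}

# Marginals p(x) and p(z).
p_x, p_z = {}, {}
for (x, z), p in p_xz.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_z[z] = p_z.get(z, 0.0) + p

H_Z = entropy(p_z.values())

# H(Z|X) = sum_x p(x) * H(Z | X=x)
H_Z_given_X = sum(
    p_x[x] * entropy([p_xz[(x, z)] / p_x[x] for z in p_z])
    for x in p_x
)

# I(X;Z) = sum_{x,z} p(x,z) * log2( p(x,z) / (p(x) p(z)) )
I_XZ = sum(
    p * math.log2(p / (p_x[x] * p_z[z])) for (x, z), p in p_xz.items()
)

# The identity holds: total entropy = input-independent part + MI.
assert abs(H_Z - (H_Z_given_X + I_XZ)) < 1e-9
```

With this split, the four patterns in the thread fall out of the two terms: high H(Z|X) with high I(X;Z) is diverse reasoning, low H(Z|X) with high I(X;Z) is compression, both low is entropy collapse, and high H(Z|X) with near-zero I(X;Z) is template collapse.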

KnowledgeLM Workshop retweeted
Manling Li @ManlingLi_
1. What is good exploration? More steps ≠ more information. Good exploration prioritizes information gain per step, so that the agent forms a complete internal map of the world. It is about knowing what you don't know, and choosing actions that reduce that uncertainty. We ask LLMs/VLMs for the best action to take next: not to solve a task, not to maximize a task reward, but to reduce spatial uncertainty and build an internal spatial belief of the world that can support future spatial reasoning.
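"Information gain per step" can be made concrete with a toy belief over possible world states: the best next action is the one whose expected observation shrinks the belief's entropy the most. The world, actions, and observation models below are all hypothetical illustrations, not the paper's setup:

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def expected_info_gain(belief, obs_model):
    """Expected entropy reduction of the belief after one action.
    obs_model maps each world state to the observation that action yields."""
    h_before = entropy(belief.values())
    # Group state probabilities by the observation they would produce.
    groups = {}
    for state, p in belief.items():
        groups.setdefault(obs_model[state], []).append(p)
    # Expected posterior entropy, weighted by each observation's probability.
    h_after = sum(
        sum(ps) * entropy([p / sum(ps) for p in ps])
        for ps in groups.values()
    )
    return h_before - h_after

# Uniform belief over four possible room layouts (toy world).
belief = {s: 0.25 for s in ["A", "B", "C", "D"]}

# Action 1 ("look left") distinguishes {A, B} from {C, D}: a clean split.
gain1 = expected_info_gain(belief, {"A": 0, "B": 0, "C": 1, "D": 1})
# Action 2 ("look right") only distinguishes A from the rest.
gain2 = expected_info_gain(belief, {"A": 0, "B": 1, "C": 1, "D": 1})

print(gain1, gain2)  # -> 1.0 and ~0.811: the even split gains a full bit
```

Picking the action with the larger expected gain is exactly "reduce that uncertainty": action 1 halves the hypothesis space no matter what is observed, so it is the better exploratory step even though neither action earns task reward.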
KnowledgeLM Workshop retweeted
Manling Li @ManlingLi_
VAGEN poster at #NeurIPS: ⏲️ 11am-2pm Wed 📍 Exhibit Hall C,D,E #5502
We look forward to discussing:
1. MDP → POMDP
2. World modeling in the agent's internal belief
3. What is a good representation of visual states in the agent's internal belief?
4. How can world modeling help reward shaping?
5. How to do turn-level critic learning?
Drop by if you are interested in related topics!
Zihan "Zenus" Wang@wzenus

VAGEN poster 𝐭𝐨𝐦𝐨𝐫𝐫𝐨𝐰 at #NeurIPS! 🎮🧠 - 🕚 11am–2pm Wed - 📍 Exhibit Hall C,D,E #5502 We had much fun exploring: • How 𝐰𝐨𝐫𝐥𝐝 𝐦𝐨𝐝𝐞𝐥𝐢𝐧𝐠 helps VLM RL agents learn better policies • 𝐌𝐮𝐥𝐭𝐢-𝐭𝐮𝐫𝐧 𝐏𝐏𝐎 credit assignment via 𝐭𝐰𝐨-𝐥𝐞𝐯𝐞𝐥 𝐚𝐝𝐯𝐚𝐧𝐭𝐚𝐠𝐞 𝐞𝐬𝐭𝐢𝐦𝐚𝐭𝐨𝐫 (Bi-Level GAE) for turn-level and token-level critic learning Come chat about agents, RL, and world models 👀
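The tweet names a two-level advantage estimator (Bi-Level GAE) but does not spell it out. The sketch below is generic GAE applied at turn granularity and then broadcast to the tokens of each turn; this is an illustrative assumption about how two levels could fit together, not VAGEN's exact formulation (which also learns a token-level critic). All numbers are toy values:

```python
def gae(rewards, values, gamma=1.0, lam=0.95):
    """Standard generalized advantage estimation. `values` carries one
    extra entry: the bootstrap value after the final step."""
    advantages, running = [], 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages.append(running)
    return advantages[::-1]

# Turn level: one reward and one critic estimate per dialogue turn.
turn_rewards = [0.0, 0.0, 1.0]       # sparse success reward at the end
turn_values = [0.2, 0.4, 0.7, 0.0]   # critic values + terminal bootstrap
turn_adv = gae(turn_rewards, turn_values)

# Token level: assign each token credit from its turn's advantage
# (broadcast here; a token-level critic could refine this further).
turn_tokens = [5, 3, 4]              # tokens generated per turn
token_adv = [a for a, n in zip(turn_adv, turn_tokens) for _ in range(n)]

assert len(token_adv) == sum(turn_tokens)
```

The point of the two levels is credit assignment: the outer GAE propagates the sparse end-of-episode reward backward across turns, and the inner level decides how that turn-level credit reaches individual tokens.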

KnowledgeLM Workshop retweeted
Qineng Wang @qineng_wang
Most VLM benchmarks watch the world; few ask how actions *change* it from a robot's eye. Embodied cognition tells us that intelligence isn't just watching – it's enacted through interaction. 👉We introduce ENACT: A benchmark that tests if VLMs can track the evolution of a home-scale environment from a robot's egocentric view. 🌐enact-embodied-cognition.github.io 📄enact-embodied-cognition.github.io/enact.pdf 1/N
KnowledgeLM Workshop @lm_knowledge
Join her lab!
Manling Li@ManlingLi_

We are looking for PhDs and Postdocs! So proud of my students for achieving so many amazing things during their "very first year".
I have been asked many times how I like being faculty, especially with funding cuts. My answer is always "it is the perfect job for me"! Still deep in the honeymoon phase. The only reason is that the students are so amazing, making my transition so much easier. One year in, they have already collected paper awards, orals, spotlights, etc.
What makes me proudest is that they are vividly alive: curious, playful, confident in their own weird way, lighting up when talking about ideas, and never afraid to explore "the thing that might fail". Everyone is just… themselves. And somehow, that version of themselves keeps shipping amazing work. In today's anxious academic world, this kind of aliveness is what I will try my best to protect.
Maybe the best part of being an advisor is that every student is so different and unique lol. Interestingly, coming into the second year, they've got their own passions, and I can't just plug my ideas into their heads. So when I get excited about something new, my first thought is: "Okay, time to find some fresh first-years who will be thrilled about this!"
MLL lab is 1 year old; we started in Oct 2024. We are growing and looking for more PhDs to join us!
1. Why our lab? (1/2)
2. Why @northwesterncs? (2/2) In 2025 alone, NU has 7 faculty as Sloan Fellows, plus a Nobel winner! Check more below

KnowledgeLM Workshop retweeted
Niloofar @niloofar_mire
🧵 Academic job market season is almost here! There's so much rarely discussed—nutrition, mental and physical health, uncertainty, and more. I'm sharing my statements, essential blogs, and personal lessons here, with more to come in the upcoming weeks! ⬇️ (1/N)
KnowledgeLM Workshop retweeted
Manling Li @ManlingLi_
[KnowledgeLM @ ACL24] @lm_knowledge 🚨 Update: We've extended the paper submission deadline to May 30 to accommodate the COLM review release. 📢 We welcome submissions of Findings papers to present at our workshop! We have lined up wonderful speakers, and we are eager to engage with you in Thailand! Meet our organizers: @ZoeyLi20 @hengjinlp @megamor2 @eunsolc @mjqzhang @peterbhase @mohitban47 @preslav_nakov @Meng_CS @JiaweiHan Website: knowledgeable-lm.github.io
KnowledgeLM Workshop @lm_knowledge
@aclmeeting If you feel captivated by these problems, come join us at the Knowledge Language Model Workshop at ACL!
KnowledgeLM Workshop @lm_knowledge
🚀 Knowledgeable Language Model Workshop at ACL24 @aclmeeting Are you ever curious about how much LLMs know? Do you ever wish that LLMs could become smarter with more knowledge? Or maybe you are thinking about removing certain facts from their memory? knowledgeable-lm.github.io