Shumin Deng

65 posts

@dsmall2apple1

Research Fellow at NUS. Research Interests: NLP, Structured Prediction, IE, KG, Neuro-Symbolic Reasoning, Multi-Agent Collaboration, Knowledge Editing for LLMs

Singapore · Joined May 2016
313 Following · 332 Followers
Shumin Deng@dsmall2apple1·
🤔 What form of experience is actually reusable across agents and environments? 💡 Our answer: Skills! But structured hierarchically! This is SkillX!
Ningyu Zhang@ZJU@zxlzr

🚀 Excited to share SkillX: Automatically Constructing Skill Knowledge Bases for Agents! It automatically converts agent trajectories into reusable, plug-and-play skills — making them transferable across agents and environments. We are also planning to integrate SkillX into the SkillNet series, aiming to build a unified and scalable ecosystem for skill-centric agent intelligence. #LLM #Agents #NLP #AI #Skills #SkillX

📖 Paper: huggingface.co/papers/2604.04…
🔗 Code: github.com/zjunlp/SkillX

🧩 Motivation
LLM agents should learn from experience, but today, most self-evolving agents still learn in isolation. They repeatedly rediscover similar behaviors from limited data, leading to:
🔹 redundant exploration
🔹 weak generalization
🔹 capability bottlenecks tied to the base model
So the key question is: What form of experience is actually reusable across agents and environments?

💡 Our answer: Skills! But structured hierarchically!
We propose SkillX, an automated framework for building a reusable Skill Knowledge Base (SkillKB). Instead of storing raw trajectories, insights, or workflows alone, SkillX organizes experience into 3 levels of skills:
1️⃣ Planning Skills: high-level task organization (ordering, decomposition, dependencies)
2️⃣ Functional Skills: reusable tool-based subroutines for completing subtasks
3️⃣ Atomic Skills: low-level tool usage patterns, constraints, and failure-prone details
This makes agent experience more compact, composable, and transferable.

⚙️ How SkillX works
SkillX constructs the skill library through 3 synergistic components:
1. Multi-Level Skills Design
2. Iterative Skills Refinement
3. Exploratory Skills Expansion

🔍 Why is this useful?
Unlike long-context skill formats that require complex sandboxing and progressive interaction, SkillX uses a lightweight, itemized representation:
✅ retrieve with a simple retriever
✅ inject once into the system prompt
✅ easier transfer across base models
✅ lower execution burden for weaker agents

📊 Results
Using GLM-4.6 to automatically build the skill library, we evaluate transfer on challenging long-horizon interactive benchmarks:
● AppWorld
● BFCL-v3
● τ2-Bench
When plugged into weaker base agents like Qwen3-32B, SkillX brings ~10-point improvements and also improves execution efficiency. ⚡

🧠 Key takeaway
● Not all “experience” transfers equally well; the representation matters.
● Hierarchical skills are a powerful abstraction for turning isolated agent experience into reusable knowledge.
● Stronger agents can build the skills, weaker agents can reuse them, and agents no longer need to keep learning everything from scratch.

✨ Additional findings
● Functional skills contribute the most to performance gains
● Planning skills often reduce execution steps
● Atomic skills are crucial for clarifying tool constraints and common failure modes
● Iterative refinement further improves the skill library
● Experience-guided expansion discovers more novel skills than random exploration

📦 We will release the optimized plug-and-play skill library to facilitate future research on reusable agent skills. Feedback, discussions, and collaborations are very welcome! 💬
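The "retrieve, then inject once into the system prompt" pattern described above can be sketched in a few lines. This is an illustrative sketch only, not the SkillX implementation: the skill entries, keyword-overlap retriever, and prompt format are all invented for the example.

```python
# Illustrative sketch (not the SkillX implementation): a skill library stored
# as itemized text entries, retrieved by word overlap and injected once into
# the system prompt. All skill texts below are hypothetical examples.

SKILL_KB = [
    {"level": "planning",   "text": "Decompose a booking task into search, compare, confirm."},
    {"level": "functional", "text": "Use the search_flights tool before calling book_flight."},
    {"level": "atomic",     "text": "book_flight requires an ISO date; it fails on 'tomorrow'."},
]

def retrieve(task: str, kb=SKILL_KB, k: int = 2):
    """Rank skills by word overlap with the task description (a stand-in
    for whatever retriever one actually uses)."""
    task_words = set(task.lower().split())
    scored = sorted(kb, key=lambda s: -len(task_words & set(s["text"].lower().split())))
    return scored[:k]

def build_system_prompt(task: str) -> str:
    """Inject the retrieved skills once, as itemized lines, into the prompt."""
    skills = retrieve(task)
    lines = [f"- [{s['level']}] {s['text']}" for s in skills]
    return "You are a helpful agent. Relevant skills:\n" + "\n".join(lines)

prompt = build_system_prompt("book a flight for the user")
print(prompt)
```

Because the representation is plain itemized text, even a weak base agent only has to read a short prompt rather than operate a sandboxed skill environment.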

Shumin Deng retweeted
Ningyu Zhang@ZJU@zxlzr·
We’re releasing the technical report for SkillNet.
Report: huggingface.co/papers/2603.04…
Homepage: skillnet.openkg.cn
Code: github.com/zjunlp/SkillNet
SkillNet explores this idea by providing a framework to create, evaluate, and organize executable AI skills at scale. SkillNet is still experimental, but we hope it can serve as a foundation for scalable skill accumulation in AI agents.
Additionally, we have been exploring an interesting direction: enterprise adoption through a private SkillNet. For enterprises, SkillNet may act as an engine for accumulating operational knowledge, turning expert SOPs and internal APIs into reusable agent skills, enabling secure private skill repositories, and allowing agents across teams to invoke business capabilities as easily as library functions. 🚀
Feedback and suggestions are very welcome. #SkillNet #AIInfrastructure #Agents #LLMs #NLP
Ningyu Zhang@ZJU@zxlzr

🚀 OpenClaw Integration Released! SkillNet is now built-in.
SkillNet is now available as a native skill in OpenClaw — giving your agent the power to automatically discover, install, create, evaluate, and analyze AI skills for scientific or technical tasks.
⚡ One command to install
🧩 Zero configuration to use
🤖 Fully autonomous skill lifecycle
Start building smarter agents today:
🌐 skillnet.openkg.cn
💻 github.com/zjunlp/SkillNet
We’d love your feedback — tell us what features you want next by opening an issue. #SkillNet 🙌

Shumin Deng retweeted
Ningyu Zhang@ZJU@zxlzr·
How controllable is a Large Language Model, really? 🧐 We often prompt LLMs to "be polite" or "act like a pirate," but the gap between intent and instantiation remains a black box. Introducing our latest work SteerEval: “How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities”

📄 Paper: huggingface.co/papers/2603.02…
💻 Code: github.com/zjunlp/EasyEdi…
📊 Datasets: huggingface.co/datasets/zjunl…

🛠️ What is SteerEval?
It’s not just a dataset—it’s a domain-extensible, hierarchical conceptual framework for automatic benchmark synthesis. By combining this automated pipeline with rigorous human verification, we introduce a principled benchmark to audit LLM controllability across 4 domains: Language Features (Form), Reasoning Patterns (Thought), Sentiment (Emotion), and Personality (Soul). 📊✨

🧠 Grounded in Marr’s Three-Level Theory
To bridge the "intent-realization gap," we borrow from Marr’s three levels of analysis, reframing LLM behavior into a Triple-Level Specification (L1-L3):
🎯 L1: Computational Level (What to express): the behavioral goal/intent (e.g., "Be Enthusiastic").
⚙️ L2: Algorithmic Level (How to express it): the behavioral strategy & patterns (e.g., "Use active voice and energized praise").
✍️ L3: Implementational Level (How to instantiate it): the physical textual realization (e.g., "Must include 'hooray' twice").

🔍 Key Findings: The "Granularity Gap" 📉
Our evaluation of many steering methods reveals a striking "Granularity Gap": steered LLMs may follow high-level commands (L1) while failing to maintain the underlying behavioral DNA at the implementational level (L3). Surface-level obedience ≠ deep-level control. 💡

🚀 Why it matters
Structured Auditing: provides a "mechanistic map" for behavioral safety. 🗺️
Scalable Synthesis: the framework allows researchers to easily extend SteerEval to new behavioral domains.
🏗️ Beyond Prompting: shifts the focus from "black-box prompting" to "fine-grained behavioral engineering." 🧬

We hope SteerEval serves as a foundation for building LLMs that are not just powerful, but truly predictable and faithful to human intent. 🤝 Would you like to see how your model performs on the L1-L3 hierarchy? Let’s chat! 💬 #Steering #SteerEval #KnowledgeEditing #AI #NLP #LLMs
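The L1-L3 idea can be made concrete with a toy behavior specification. Everything in this sketch (the field names, the "hooray" check, the `audit` helper) is hypothetical, invented to illustrate the hierarchy; it is not SteerEval's actual schema, and only the L3 surface constraint is mechanically checkable here.

```python
# Hypothetical triple-level behavior spec in the spirit of the L1-L3
# hierarchy described above. L1/L2 would normally need a judge model;
# L3 is a concrete, checkable surface constraint.

spec = {
    "L1_goal":     "Be Enthusiastic",                        # what to express
    "L2_strategy": "Use active voice and energized praise",  # how to express it
    "L3_surface":  lambda text: text.lower().count("hooray") >= 2,  # how to instantiate it
}

def audit(text: str, spec: dict) -> dict:
    """Run only the mechanically checkable L3 constraint on a model output."""
    return {"L3_pass": spec["L3_surface"](text)}

print(audit("Hooray! Great job. Hooray again!", spec))  # satisfies L3
print(audit("Nice work, well done.", spec))             # roughly obeys L1, fails L3
```

The second output is exactly the "granularity gap" pattern: the text is plausibly enthusiastic at the goal level while violating the implementational constraint.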
Shumin Deng retweeted
Ningyu Zhang@ZJU@zxlzr·
Ningyu Zhang@ZJU@zxlzr

AI systems repeatedly reinvent the same domain know-how—buried in prompts, tools, and brittle pipelines. Skills remain fragmented, duplicated, and inconsistent in quality. We believe the missing layer in the AI stack is skills as infrastructure.
We are pleased to introduce SkillNet, an ongoing project to build an open infrastructure for creating, evaluating, and organizing executable AI skills at scale.
Homepage: skillnet.openkg.cn
Code: github.com/zjunlp/SkillNet
SkillNet is not a skill repository. It is infrastructure to standardize how skills are built, evaluated, and interconnected across domains. With SkillNet:
→ Skills become reusable, composable assets
→ Agents gain reliable, evaluated capabilities
→ Workflows become modular and interoperable
→ Knowledge becomes infrastructure
Each SkillNet skill undergoes explicit evaluation across safety, completeness, executability, maintainability, and cost. This infrastructure may enable composable scientific and enterprise workflows. We gratefully acknowledge the open-source community for sharing numerous projects and skills that inspired this work.
We have built an initial prototype demonstrating:
• Autonomous Scientific Discovery
• Autonomous Coding Agents
The system is still experimental and not yet production-ready. An initial Python library supporting skill search, download, creation, evaluation, and analysis is available: pip install skillnet-ai
Technical report coming soon. #SkillNet #Skills #Agents #LLMs #NLP
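The five evaluation dimensions listed above (safety, completeness, executability, maintainability, cost) suggest a minimal shape for a skill record. This sketch is illustrative only; it is not the `skillnet-ai` library's API, and the class and skill names are invented.

```python
# Illustrative only: a minimal skill record carrying the five evaluation
# dimensions the SkillNet announcement lists. Not the skillnet-ai API.

from dataclasses import dataclass, field

@dataclass
class SkillRecord:
    name: str
    domain: str
    scores: dict = field(default_factory=dict)  # dimension -> score in [0, 1]

    DIMENSIONS = ("safety", "completeness", "executability", "maintainability", "cost")

    def is_fully_evaluated(self) -> bool:
        """A skill is publishable only once every dimension has a score."""
        return all(d in self.scores for d in self.DIMENSIONS)

# Hypothetical skill from a scientific workflow.
skill = SkillRecord(name="parse_pdb_structure", domain="science")
for dim in SkillRecord.DIMENSIONS:
    skill.scores[dim] = 0.9
print(skill.is_fully_evaluated())
```

Making the evaluation dimensions an explicit, enforced part of the record is one way "skills as infrastructure" differs from a plain prompt library.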

Shumin Deng retweeted
Ningyu Zhang@ZJU@zxlzr·
📢 Call for Papers: Memory, Knowledge Updating, and Evolution in AI Agents 💡
We're excited to invite submissions to this Frontiers Research Topic exploring how AI agents can accumulate long-term experience, update internal knowledge reliably, and evolve behaviorally across tasks & environments.
🔍 Topics include:
* LLM / Agent (Multimodal) Memory
* Knowledge Editing & LLM / Agent Steering
* Efficient LLM Reasoning
* Agent Evolution
* LLM / Agent Innovation
📅 Submit by Jun 5 (abstract) / Jul 12 (full paper), 2026 and join a leading forum for next-generation adaptive AI! 🚀
🔗 Learn more & submit: frontiersin.org/research-topic…
#AI #ML #Agents #AIresearch
Shumin Deng@dsmall2apple1·
At #EMNLP2025 🎉 Excited to connect & discuss #LLMs & #AI. I’ll chair AI/LLM Agents 1 Session (11.5, 16:30–18:00, A110). Come say hi! Also catch me at: 🧭Steering MLLMs: Poster 4, 11.6, 10:30 🤝Merging LLMs: Findings 2, 11.6, 12:30 🧠Editing Models: Poster 7, 11.7, 14:00 #NLProc
Shumin Deng retweeted
Ningyu Zhang@ZJU@zxlzr·
🚦 Steer LLMs at runtime, no retraining required. Meet AutoSteer: safer, stronger multimodal AI.
🔥 Thrilled to introduce AutoSteer, an inference-time safety steering framework for MLLMs, guiding model behavior without retraining, accepted by #EMNLP2025 @emnlpmeeting
Why it matters 👇
1️⃣ Plug-and-play safety: apply steering at inference to reduce harmful or off-policy outputs.
2️⃣ Multimodal generality: works across text and vision-language tasks.
3️⃣ Balance: improves robustness while preserving utility.
4️⃣ Scalable: no need for costly fine-tuning or RLHF retraining.
AutoSteer shows that safer AI doesn’t have to mean weaker AI 🚀
📖 Paper: arxiv.org/abs/2507.13255
🛠 Code: github.com/zjunlp/AutoSte…
#AI #LLM #Safety #Multimodal #NLProc #ML
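Inference-time steering methods in general work by shifting a model's hidden activations along a learned direction at generation time, with no weight updates. A schematic sketch of that family follows; it is not AutoSteer's specific algorithm, and the vectors here are random stand-ins rather than real model activations.

```python
# Schematic activation steering: shift a hidden state along a direction at
# inference time. This illustrates the general technique, not AutoSteer.

import numpy as np

def steer(hidden: np.ndarray, direction: np.ndarray, alpha: float) -> np.ndarray:
    """Add alpha times the unit-normalized direction to a hidden state."""
    unit = direction / np.linalg.norm(direction)
    return hidden + alpha * unit

rng = np.random.default_rng(0)
h = rng.normal(size=8)           # stand-in for one token's hidden state
safety_dir = rng.normal(size=8)  # stand-in for a learned safety/refusal direction

h_steered = steer(h, safety_dir, alpha=2.0)

# The projection onto the steering direction increases by exactly alpha,
# leaving the orthogonal components (the rest of the representation) intact.
unit = safety_dir / np.linalg.norm(safety_dir)
proj_before = h @ unit
proj_after = h_steered @ unit
print(round(proj_after - proj_before, 6))
```

That last property is why such interventions can nudge behavior (e.g., toward safer outputs) while largely preserving utility: only one direction in activation space is moved.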
Shumin Deng@dsmall2apple1·
✈️📍 Touched down at #ACL2025 Super excited for the talks, posters, and hallway chats. Looking forward to meeting old friends and making new ones. Feel free to say hi! 🤝 Ping me if you want to meet up. Let’s talk about interesting NLP topics! 🧠💬 #ACL #NLProc #AI
Shumin Deng retweeted
Yunzhi Yao@yyzTodd·
🚨 New Blog Drop! 🚀 "Reflection on Knowledge Editing: Charting the Next Steps" is live! 💡
Ever wondered why knowledge editing in LLMs still feels more like a lab experiment than a real-world solution? In this post, we dive deep into where the research is thriving — and where it's falling short. From foundational breakthroughs to the practical roadblocks no one’s talking about, we connect the dots and propose what’s needed to move forward. Join the conversation! #KnowledgeEditing #LLMs #AI #ModelEditing
📌 If you're working on LLMs, model updates, or mechanistic interpretability, you don’t want to miss this.
👉 Read the full post: yyzcowtodd.cn/rethinkedit
Key insights from our analysis:
0⃣ Current evaluation metrics and benchmarks inadequately assess knowledge updates in LRMs, highlighting the need for more comprehensive evaluation frameworks.
1⃣ Scaling challenges persist, with significant memory and computational constraints limiting the practical application of editing methods for larger or quantized local models.
🎁 Resource Release: To support the research community, we release covariance matrices for Qwen2.5-32B & QwQ-32B models for the current locate-and-edit methods.
2⃣ We outline promising research directions for developing language models that can effectively learn, adapt, and evolve their knowledge base.
Huge thanks to the brilliant collaborators who made this deep dive into #ModelEditing possible! @uclanlp @CanyuChen3 @Jiachen_Gu @dsmall2apple1 @ManlingLi_ @VioletNPeng
Shumin Deng@dsmall2apple1·
A practical paradigm of unlearning😇
Ningyu Zhang@ZJU@zxlzr

Has the over-forgetting of large language models led to "aphasia"? Our latest work, ReLearn: Unlearning via Learning for Large Language Models, brings a solution! 🚀
Paper: arxiv.org/abs/2502.11190
Code: github.com/zjunlp/unlearn
ReLearn adopts a forward learning strategy rather than traditional disruptive reverse optimization, effectively forgetting sensitive information while avoiding excessive suppression and eliminating the issue of generating repetitive and meaningless words.
✅ Key Advantages:
🗣️ Improved Language Fluency: generates fluent and relevant text, avoiding the "aphasia" phenomenon.
🔒 Balanced Knowledge Forgetting and Retention: effectively forgets sensitive information while retaining general knowledge.
📊 New Evaluation Framework: introduces KFR, KRR, and Linguistic Score to comprehensively measure forgetting effects and linguistic quality.
🤯 In-depth Mechanistic Analysis: analyzes the cognitive conflicts caused by reverse optimization from the perspective of knowledge distribution and memory, providing a fresh understanding of the unlearning mechanism in large language models.
Dive deeper into ReLearn, making your large language models “forget” smarter and “express” better! 👉
#LLMs #Unlearning #ResponsibleAI #PrivacyProtection #NLP #KnowledgeEditing
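Forgetting-rate and retention-rate metrics of the kind mentioned above can be sketched simply. This is one plausible simplified reading, not ReLearn's exact definitions of KFR/KRR: score each probe as correct or not after unlearning, then aggregate separately over the forget set and the retain set.

```python
# Illustrative metric sketch (not ReLearn's exact KFR/KRR definitions):
# given per-probe correctness after unlearning, compute a forgetting rate
# on the forget set and a retention rate on the retain set.

def forgetting_rate(correct_after_forget_set):
    """Fraction of forget-set probes now answered wrongly (higher = better forgetting)."""
    wrong = sum(1 for c in correct_after_forget_set if not c)
    return wrong / len(correct_after_forget_set)

def retention_rate(correct_after_retain_set):
    """Fraction of retain-set probes still answered correctly (higher = better retention)."""
    return sum(correct_after_retain_set) / len(correct_after_retain_set)

print(forgetting_rate([False, False, True, False]))  # 0.75
print(retention_rate([True, True, False, True]))     # 0.75
```

The point of reporting both is exactly the balance the tweet describes: a method that drives the first number to 1.0 while collapsing the second has over-forgotten into "aphasia."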

Shumin Deng retweeted
wing.nus@wing_nus·
We’re excited about @EllaMinzhiLi’s joint work, DnA-Eval, which breaks down the evaluation process into decomposition and aggregation stages. It not only leads to consistent performance improvement in LLMs’ evaluation capability but also brings greater interpretability 🧵 (1/5)
Shumin Deng retweeted
Fazl Barez@FazlBarez·
🚨 New Paper Alert: Open Problems in Machine Unlearning for AI Safety 🚨
Can AI truly "forget"? While unlearning promises data removal, controlling emergent capabilities is an inherent challenge. Here's why it matters: 👇
Paper: arxiv.org/pdf/2501.04952
1/8
Shumin Deng@dsmall2apple1·
Impressive summary and outlook of #KnowledgeEditing 😇
Ningyu Zhang@ZJU@zxlzr

Over the past year, #KnowledgeEditing has experienced rapid development. As the new year begins, I’ve taken some time to reflect on the progress of this field and share my thoughts on its future directions. I look forward to discussing and collaborating with everyone to further advance this area.

🛠 Progress in Knowledge Editing:
1. Scenarios: In addition to updating the knowledge of LLMs, many works have begun exploring knowledge editing as a means to control model behavior, promoting safer and more controllable generation while enabling capabilities like unlearning.
2. Side Effects: Many works have started to reflect on the fundamental causes of the side effects of knowledge editing and have explored various methods to mitigate them. Editing LLMs (parameter-altering) can lead to overfitting, where models assign disproportionately high importance to edited content and disrupt attention mechanisms, reducing generalization and general abilities. Whether the model has truly updated its relevant knowledge remains questionable.
3. Practicality: While knowledge editing has expanded to fields like software engineering and multimodal tasks, its real-world impact remains limited.

💡 Key Reflections:
1. The field's foundational goal—knowledge updates—has seen limited success outside areas like AI safety. This raises questions about how to better align methods with practical needs.
2. Mechanism research is lagging. Without clear insights into why knowledge editing works (or doesn’t), efforts to improve models risk being akin to “blind men describing an elephant.”

📈 Future Directions:
1. Evaluation: We need a set of metrics/benchmarks to evaluate whether an edited LLM behaves properly, that is, to achieve a balance between generalization and side effects.
2. Steering: Steering vectors (with SAEs) are emerging as a promising approach for interventions in model behaviors, particularly in domains like safety and personality alignment. These methods demonstrate the potential to achieve precise control with minimal impact on overall model performance. Furthermore, they may pave the way for bridging the gap between prompts and model parameter updates, enabling prompt-driven, parameterized behavior adjustments within the model.
3. Agent Memory Updates: The debate between symbolic and parametric memory for AI agents is ongoing. Knowledge editing techniques can offer a unified approach to memory updates, bridging the gap between updating both the model's internal memory and external memory. Memory updates may enhance reasoning capabilities over the long term, fostering the gradual evolution of System 2-like slow-thinking processes.
4. Mechanism Interpretation: Deepening our understanding of model mechanisms is essential. Currently, research on the mechanisms of LLMs—such as neurons and circuits—lacks systematic exploration. It also fails to explain phenomena like the dynamic acquisition and forgetting of knowledge, as well as higher-order cognitive behaviors such as slow-thinking reasoning.
5. Interdisciplinary: Drawing inspiration from cognitive/brain science, we may design the next generation of model architectures and model-updating paradigms, and potentially simulate human brain behavior based on neural networks to construct an electronic digital twin brain, enabling better solutions (e.g., neuromodulation) to problems in neuroscience and cognitive science. If one day machines truly awaken to self-awareness, understanding their mechanisms and having the means to control them will be a critically important technology.

🎉 Exciting News: We’re thrilled to announce that EasyEdit2 is currently in development! This next-generation toolkit will integrate steering capabilities to enable control over model behavior. Stay tuned for updates, and we welcome the community to explore and contribute: github.com/zjunlp/EasyEdit

Let’s continue pushing the boundaries of #KnowledgeEditing, tackling its challenges, and exploring its vast potential to redefine AI adaptability and usability. #LLM #AI #NLP #EasyEdit #ModelEditing

Shumin Deng retweeted
Zhiyuan@ZhiyuanCS·
🚀 Exciting News! Workshop on Reasoning and Planning for Large Language Models @ ICLR 2025 is coming 🌟
Please visit our official website: 👉 …shop-llm-reasoning-planning.github.io
With the release of o1 Pro and the growing interest in research on reasoning and planning capabilities of LLMs, we are excited to announce our Workshop on Reasoning and Planning for Large Language Models, to be held at the prestigious ICLR 2025! This workshop will focus on topics including enhanced reasoning and planning methodologies for training and inference, benchmark development, reasoning-augmented applications in multimodal and embodied intelligence, and other related research areas.
We are honored to host renowned speakers and thought leaders in reinforcement learning, knowledge representation, reasoning, and LLM research, including:
- Sheila McIlraith @SheilaMcIlraith, University of Toronto (Fellow of ACM, Fellow of AAAI)
- Guy Van den Broeck @guyvdb, UCLA (Sloan Fellowship)
- Hongyu Ren @ren_hongyu, OpenAI (foundational contributor to OpenAI o1)
- Yuandong Tian @tydsh, Meta (Research Scientist Director at Meta FAIR)
- Natasha Jaques @natashajaques, University of Washington & Google DeepMind (ICML Best Paper Honourable Mention)
- Bo An, NTU (PC Chair of IJCAI 27)
- Stephen McAleer @McaleerStephen, OpenAI
- Junxian He @junxian_he, HKUST
For detailed information about the workshop, speakers, and submission guidelines, please visit our official website. We warmly invite academic researchers and industry professionals interested in these exciting topics to follow, submit, and participate in our workshop! #ICLR2025 #LLMs #Reasoning #Planning #Workshop
Shumin Deng retweeted
Manling Li@ManlingLi_·
[Long Tweet Ahead] Faculty Interview Tips & Common Questions:

🧘‍♀️ 0. Firstly, do not be nervous
- Almost everything can be prepared in advance :)
- Be grateful for everyone's time.
- Think of it as an opportunity to share your research with others -- exciting, right?
- Technical issues WILL happen -- no worries.
- Try meditation! (Seriously, it helps me tremendously with the interview marathon.)

🚀 1. The MOST crucial part: Research Vision
This is what keeps me up at night (literally!), trying to distill my entire research agenda into one powerful sentence. It is like crafting your research tagline/punchline/slogan. What is your unique contribution? What makes you stand out?
Here is the thing: it is fundamentally why the university wants to hire you. They want to see you as a rising star for the next few years, someone who can make the university's name become associated with some impactful research. Think about it: when people want to learn about a specific topic, they immediately think "Oh, I should check out X's work because they are THE person for this". The university is not just hiring a researcher; they are investing in a vision for the future of the field.
The key is to come up with a punchline that captures your research identity and repeat it all the time during the talks and onsites. Ask yourself:
- What will your name be associated with in the next decades?
- Where is your field heading? (Is RoboGPT the future? Is Transformer really the final architecture?)
- What are the REAL unsolved challenges? (Not just throwing more data at problems.)
Get ready to discuss:
- Are large models really the future? Can we achieve true intelligence just by scaling up?
- What about data bottlenecks? Is synthetic data reliable? What are effective ways to collect data?
- Do models really reason? Do we need symbols/structures?
- Is Transformer the final answer? What is the bottleneck of Transformer?
- What are the new tasks we really need to focus on?
- How do you think of the current research trend of creating evaluation benchmarks?
- What is still fundamentally missing in current research?

🤓 2. The BIG question: "Why Academia?"
This is actually what you should confirm multiple times with yourself. It is really about your passion and motivation:
- What were your happiest moments? (I talked about those late-night breakthrough moments haha.)
- Where do you see yourself in 50 years? (Dream big! Talk about the research institute you want to build, the problems you want to solve, the leader you want to become in your field.)
- What is your ideal group size, and what resources will you need? (Be concrete!)
- Are you also looking at industry jobs?
Here is the real talk: we are in the age of large AI models requiring infinite GPUs. So you need to have solid answers about:
- Why choose academia NOW?
- How you will position yourself in this large-model era
- The practical stuff, like how you will handle GPU needs (I will concretely mention XX research directions that don't need massive compute, XX research requires GPUs but I have XX potential funding sources, and collaboration opportunities with XX)

💼 3. Logistics about Application Materials
3.1 Application Materials:
- Use figures. They are always what people check first when reading long documents.
- DO NOT miss deadlines! You can usually update materials after submission.
3.2 Personal Websites:
- CV and websites are very important (I personally feel they are even more important than research statements, or at least equal).
- Two must-haves for your website: (1) CV (fresh and updated!), research statement, teaching statement, diversity statement (since people may not be able to find your package quickly during onsites; the research statements can always be updated if you have better ideas for your storyline); (2) your email address: make it OBVIOUS.

⏰ 4. Logistics about Timeline
4.1 My actual timeline:
- First phone interview: Dec 14
- First onsite: Jan 11
- Last onsite: Apr 7
4.2 If you are asked to choose an interview slot:
- Most importantly, figure out whether it is "rolling-based" or not.
- Rolling admissions? Interview earlier!
- Non-rolling? Later interviews = more practice = better performance
4.3 Timing matters:
- Schedule your dream schools for mid-Feb to early-Mar. (Most universities I interviewed with after mid-March did not extend an offer, but almost all my January and February interviews led to offers, while the universities are of similar rank.)
- Health tip: protect yourself from COVID during the Jan-Feb interview season (I had to reschedule several interviews; learned this one from experience! 😷)

🎙️ 5. Logistics about Phone Interviews
Let us talk about something that makes everyone nervous: the interview process. I have learned that preparation is KEY. Let us go through it step by step.
5.1 The "Why THIS School" Question (some universities even ask this as the first question, so I started to do more preparation on this part):
- First think about what makes the university special (Is it known for something unique? What research centers do they have?)
- Name-drop (respectfully!) potential collaborators in the department
- Track their recent wins (I always check department news before interviews)
- Think about location benefits (research collaborations, funding opportunities, industry connections)
Pro tip: keep a cheat sheet with specific details for each school. Trust me, it helps when you're on your 5th interview and the details start blurring!
5.2 Research Vision 2.0 (School Edition)
This is where you customize your research vision for THEIR context:
- Show how you fill a unique gap in their department
- Paint an exciting picture of future collaborations (for example, when listing your future directions, you can say: I am excited about this future direction xx, and xx university is perfect for me since I can collaborate with faculty xx and research centers xx.)
5.3 Teaching Plans:
- Specific course numbers (both undergrad and grad levels) that you could teach
- Your dream course ideas (I actually created a full syllabus for a Multimodal Machine Learning course and put it on my website. Having concrete materials ready shows you are serious about teaching.)

🎤 6. Logistics about Onsite (The Big Show)
Alright, let us talk about the main event, the onsite interview! This is where the real magic happens, and a lot of things can be prepared, as always.
6.1 The Job Talk: Your Moment to Shine ✨
Let's be real: this is THE most important part of whether you get the offer (the others are all minor). Still, your research vision is the most important:
- Boil down your idea to ONE powerful sentence (and repeat it strategically!)
- The first 10-15 minutes are GOLD. Some dept chairs only stay for this part. Be sure to show your research impact.
- The goal is to EXCITE people about your research. I always start with a walkthrough example (this works way better than diving straight into theory).
- Guide viewer attention for EVERY. SINGLE. SENTENCE. (Use animations, strategic dimming, highlight what matters.)
- Time management is crucial: aim for 40 mins + Q&A. (And be prepared that talks often start late! Factor in technical issues, waiting for people, and other things.)
- I add a progress bar to help people track my talk.
6.2 Handling Q&A Like a Pro 🎯
- Drop mini Q&A slides after your first and second sections (if you want to increase interaction with the audience, this works!)
- Golden rule: be concise + logical
- Common questions to prep for:
"How do you handle bias/safety issues in model learning? What about adversarial attacks?"
"How do you create data in model learning?"
"Would you say your work leans more towards ML theory or applications?"
"I do not think it is the right way to get it to work; what about xx?"
(I feel a lot of the audience will be outside your area, and when they try to connect to their own direction, there will be far-out questions you never thought about, or you will face challenges from people saying they do not believe in black boxes, or do not believe in symbols, or do not believe in some other things. It is totally okay! Do not panic! You should always be confident about your direction. No need to get irritated or defensive. No need to back down or bluntly disagree / turn it into a debate. Just treat it as a research discussion. Something like: I think it is an interesting angle. At the current research stage, I believe that my way is the most reliable and practical way of handling xx; however, later I would be happy to explore more on xx, and it would be great if we could even collaborate on this.)
6.3 Surviving the Marathon: One-on-Ones 👥
I am super introverted and not good at small talk, so this is more a guideline for introverted people haha. I feel these are not really casual chats. Each one is a mini-presentation opportunity.
- I have heard people say that one-on-ones are just to see whether you are nice. I do not agree. People won't hire you because you are nice, but more because of the unique, exciting insights you can bring, which can earn you high voting scores.
- Do your homework on EACH professor:
• Check their recent papers (Google Scholar, sort by time)
• Know what made them famous (sort by citations)
• Look up their grants and awards
• Find personal connections (alma mater? city connections?)
6.4 Lunches & Dinners 🍽️
Again, since I often worry about being too introverted, I like to prepare talking points in advance. I usually focus on my strengths, such as research, mentoring philosophy, and funding applications. If you happen to know something interesting about the city or the food, that is a great conversation starter and a bonus!

(Job search season is here again! I have been receiving DMs about faculty interview advice, so I thought I'd share a few key insights that personally helped me navigate the process. If you have already seen the slides I shared earlier, this is essentially the same content. Just a heads-up to save your time!)
Shumin Deng@dsmall2apple1·
I will present our WKM (Agent Planning with World Knowledge Model) at #NeurIPS2024 tomorrow. 🎯: East Exhibit Hall A-C #3311 🕙: Thu 12 Dec 11 a.m. PST — 2 p.m. PST Welcome to come and discuss with me!
Ningyu Zhang@ZJU@zxlzr

We are going to present the following works at #NeurIPS2024 covering knowledge editing, mechanisms, and agents. Welcome to discuss with the presenters @dsmall2apple1 @yyzTodd! (PS: I'm not attending.)

[Knowledge Editing] WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models
Location: East Exhibit Hall A-C #3403
Time: Wed 11 Dec 4:30 p.m. PST — 7:30 p.m. PST
Also at the Foundation Model Interventions Workshop
Location: West Meeting Room 121, 122
Time: Sun 15 Dec, 8:15 a.m. PST

[Knowledge Mechanism] Knowledge Circuits in Pretrained Transformers
Location: East Exhibit Hall A-C #3211
Time: Thu 12 Dec 4:30 p.m. PST — 7:30 p.m. PST

[Knowledgeable Agents] Agent Planning with World Knowledge Model
Location: East Exhibit Hall A-C #3311
Time: Thu 12 Dec 11 a.m. PST — 2 p.m. PST

Shumin Deng@dsmall2apple1·
[4/4] 📑 KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents. ⏰ 16:40, Aug 16 (Fri), 2024 📣 The 3rd Workshop on Knowledge-Augmented Methods for NLP (KnowledgeNLP@ACL'24), Oral Session 2
Shumin Deng@dsmall2apple1·
[3/4] 📑 Exploring Human-AI Interaction: A Case Study on the Diplomacy Game ⏰ 11:45-13:45, Aug 15 (Thu), 2024 📣 Human-Centered Large Language Modeling Workshop (HuCLLM'24@ACL), Poster Session
Shumin Deng@dsmall2apple1·
I will attend #ACL2024 from Aug 10 afternoon to Aug 17 morning, looking forward to having vibrant discussions on interesting topics with attendees! 😃
Ningyu Zhang@ZJU@zxlzr

Excited to share that our team will be at ACL 2024 in Bangkok from August 11-16! We'll be presenting our latest work on knowledge editing, agent learning, and information extraction at both the main conference and workshops @aclmeeting. Can't wait to connect and chat with everyone there! 🌟 #ACL2024 #AI #NLP #LLMs
1. Mon 11:00-12:30 @ Convention Center A1, Poster Session A: EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models (Demo)
2. Mon 11:00-12:30 @ Convention Center A1, Poster Session A: OceanGPT: A Large Language Model for Ocean Science Tasks
3. Tue 10:30-12:00 @ Convention Center A1, Poster Session D: Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View
4. Tue 10:30-12:00 @ Convention Center A1, Poster Session D: IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus
5. Tue 16:00-17:30 @ Convention Center A1, Poster Session E: Detoxifying Large Language Models via Knowledge Editing
6. Tue 16:00-17:30 @ Convention Center A1, Poster Session E: Unified Hallucination Detection for Multimodal Large Language Models
7. Wed 10:30-12:00 @ Convention Center A1, Poster Session F: AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning
8. Wed 10:30-12:00 @ Convention Center A1, Poster Session F: EasyInstruct: An Easy-to-use Instruction Processing Framework for Large Language Models (Demo)
Our team will also present works at the NLRSE, HuCLLM, Knowledge Augmentation NLP, and KnowledgeableLMs workshops during #ACL2024 in Bangkok! 🎉 Huge thanks to all the organizers for their hard work in making this event happen!
