Changsheng Wang @ NeurIPS

110 posts

@wcsa23187

PhD Student @ Michigan State University Working on LLM Unlearning, Interpretability, and Robust Reasoning Research Intern @ Intel | USTC Alumnus

East Lansing, Michigan · Joined October 2022
1.6K Following · 334 Followers
Changsheng Wang @ NeurIPS@wcsa23187·
Excited to attend NeurIPS 2025 with our OPTML group! Grateful for the opportunity to learn, present, and connect with researchers working on trustworthy AI. See you in San Diego! 🌟✈️
sijia.liu@sijialiu17

✈️✈️✈️Heading to San Diego for #NeurIPS2025! Thrilled to share @OptML_MSU’s “Menu of Innovations”, a showcase of our students’ great work in LLM interpretability, unlearning, reasoning safety, and model honesty (one Spotlight, two Posters, one Workshop paper, and one Rising Star Award). Excited to meet new and old friends in San Diego! And OPTML is hiring PhD students; if you’re interested in trustworthy and scalable AI, feel free to ping me and meet up. 🚀 @zyh2022 @ChongyuFan @wcsa23187

Wenhu Chen@WenhuChen·
Taken from RedNote.
Changsheng Wang @ NeurIPS@wcsa23187·
🎯 Our EMNLP 2025 Main paper “Reasoning Model Unlearning: Forgetting Traces, Not Just Answers, While Preserving Reasoning Skills” goes live soon! Catch us on Wednesday in Suzhou at #EMNLP2025 🇨🇳
🔗 Paper: arxiv.org/pdf/2506.12963
🏡 Project page: r2mu.netlify.app
🗓 November 5, 11:00–12:30 CST (UTC+8)
📍 Hall C, Section 2, 500-Main
🧍 I won’t be there in person, but feel free to chat with my co-authors!

🧠 The Problem
You’ve erased sensitive answers from your LRM. But the reasoning traces, the step-by-step “thoughts” that led there, still remain. Even after unlearning, the model can reconstruct or re-infer forgotten answers through these traces. So the question is:
👉 Can we truly forget reasoning traces while preserving the model’s reasoning ability?

🎯 Our Solution: R²MU (Reasoning-aware Representation Misdirection for Unlearning)
We go beyond answer-level forgetting and target the reasoning process itself. R²MU suppresses sensitive reasoning traces while maintaining general reasoning competence. Through representation misdirection, the model “unthinks” unsafe reasoning paths, while CoT supervision preserves valid reasoning skills.

⚙️ How It Works
🔄 Unthinking loss: misaligns hidden representations of sensitive reasoning traces with randomized features.
💡 Reasoning preservation: uses CoT datasets (like LIMO) to retain problem-solving ability.

✅ R²MU erases reasoning traces, not just answers.
✅ Preserves general reasoning and utility across diverse benchmarks.
✅ Achieves the lowest reasoning-trace leakage (RT-UA ↓) on the unlearning benchmark WMDP and the LRM safety benchmark STAR-1, while maintaining top reasoning accuracy on AIME, MATH-500, and GPQA.

👥 With amazing collaborators from MSU: @ChongyuFan, @zyh2022, @jia_jinghan, and my advisor @sijialiu17.
🙏 Grateful to our IBM collaborators from @MITIBMLab: @NathalieBaraca1, Dennis Wei, @p_ram_p.
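For readers who want a concrete picture of the "unthinking loss" mentioned above, here is a minimal NumPy sketch of what a representation-misdirection loss could look like. The function name, tensor shapes, and the mean-squared form are illustrative assumptions on my part, not the paper's exact objective:

```python
import numpy as np

def unthinking_loss(hidden_states, trace_mask, random_feats, coeff=1.0):
    """Illustrative 'unthinking' loss: push hidden states at sensitive
    reasoning-trace positions toward a fixed random target vector, so the
    model stops encoding those traces.

    hidden_states: (batch, seq, dim) activations from a chosen layer
    trace_mask:    (batch, seq) 0/1 mask, 1 at sensitive trace tokens
    random_feats:  (dim,) fixed random target features
    """
    # Squared distance to the random target, averaged over the hidden dim
    per_token = ((hidden_states - random_feats) ** 2).mean(axis=-1)  # (batch, seq)
    # Only penalize positions flagged as sensitive reasoning-trace tokens
    masked = per_token * trace_mask
    denom = max(trace_mask.sum(), 1)  # avoid division by zero
    return coeff * masked.sum() / denom
```

In the full method, a term like this would be combined with a reasoning-preservation loss on CoT data (e.g. LIMO) so that general problem-solving ability is retained.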
Changsheng Wang @ NeurIPS retweeted
Johnny Tian-Zheng Wei@johntzwei·
Announcing 🔭✨Hubble, a suite of open-source LLMs to advance the study of memorization! Pretrained models up to 8B params, with controlled insertion of texts (e.g., book passages, biographies, test sets, and more!) designed to emulate key memorization risks 🧵
Changsheng Wang @ NeurIPS retweeted
Alexandr Wang@alexandr_wang·
new research from Meta FAIR: Code World Model (CWM), a 32B research model
we encourage the research community to research this open-weight model!
pass@1 evals, for the curious:
65.8% on SWE-bench Verified
68.6% on LiveCodeBench
96.6% on Math-500
76.0% on AIME 2024 🧵
Changsheng Wang @ NeurIPS retweeted
Yu Su@ysu_nlp·
Computer Use: Modern Moravec's Paradox
A new blog post arguing why computer-use agents may be the biggest opportunity and challenge for AGI. tinyurl.com/computer-use-a…
Table of Contents
> Moravec’s Paradox
> Moravec's Paradox in 2025
> Computer use may be the biggest opportunity for AGI
> Chatbots → agents
> Internet-scale learning of human cognition
> Bits > atoms
> Enormous economic value
> Why is computer use hard for AI?
> Computer use ≠ clicks + typing
> Idiosyncratic environments
> Contextual understanding
> Tacit knowledge
> Is RL the panacea?
> Looking forward
If you are also excited about CUAs and want to do some serious work, let's chat!
Changsheng Wang @ NeurIPS retweeted
Scale AI@scale_AI·
New Scale research: Can smaller models reliably oversee stronger LLM agents? We red team monitoring systems to detect covert sabotage, like agents secretly downloading sensitive information.
Changsheng Wang @ NeurIPS retweeted
Google DeepMind@GoogleDeepMind·
What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵
Changsheng Wang @ NeurIPS retweeted
sijia.liu@sijialiu17·
Thank you @INNSociety for this great honor. I am deeply grateful to my nominator, students, and collaborators who made this recognition possible. Excited to keep advancing the frontiers of scalable and trustworthy AI! @OptML_MSU
INNS@INNSociety

🏆Congratulations to Bo Han, Souvik Kundu, and Sijia Liu for receiving the 2024 #INNS Aharon Katzir Young Investigator Award in recognition of promising research in the field of neural networks! 🔗Learn more: loom.ly/_sbJr0I #AharonKatzir #INNSAwards #neuralnetworks

Shivam Duggal@ShivamDuggal4·
For @NeurIPSConf, we can't update the main PDF or upload a separate rebuttal PDF — so no way to include any new images or visual results? What if reviewers ask for more vision experiments? 🥲 Any suggestions or workarounds?
Shirley Wu@ShirleyYXWu·
CollabLLM won #ICML2025 ✨Outstanding Paper Award along with 6 other works! icml.cc/virtual/2025/a…
🫂 Absolutely honored and grateful for coauthors @MSFTResearch @StanfordAILab and friends who made this happen!
🗣️ Welcome to our presentations about CollabLLM tomorrow (Tuesday):
- Oral 1A (icml.cc/virtual/2025/s…)
- Poster Session 1 East (icml.cc/virtual/2025/s…)
- Multiagent Social (icml.cc/virtual/2025/4…)
Please check out:
Website: aka.ms/CollabLLM
Github: github.com/Wuyxin/collabl…
Paper: arxiv.org/pdf/2502.00640
Blog: wuyxin.github.io/collabllm/#blog
Shirley Wu@ShirleyYXWu

Even the smartest LLMs can fail at basic multiturn communication
Ask for grocery help → without asking where you live 🤦‍♀️
Ask to write articles → assumes your preferences 🤷🏻‍♀️
⭐️CollabLLM (top 1%; oral @icmlconf) transforms LLMs from passive responders into active collaborators.
Website: aka.ms/CollabLLM
Github: github.com/Wuyxin/collabl…
Blog: wuyxin.github.io/collabllm/#blog
Paper: arxiv.org/pdf/2502.00640
🎯 Key insight: Rewards responses not by immediate helpfulness, but by their long-term impact on the conversation trajectory.
@MSFTResearch @StanfordAILab @stanfordnlp

Changsheng Wang @ NeurIPS@wcsa23187·
🎉 Excited to share that our OPTML lab has two papers on unlearning robustness at #ICML2025! 🧠 Plus, we’ll present our work on reasoning unlearning at the MUGen Workshop. Come check out our posters and chat with us!
sijia.liu@sijialiu17

🚨 Excited to attend #ICML2025 and share our latest work (@OptML_MSU) on LLM unlearning -- think of it as AI surgery: removing harmful knowledge while preserving general utility. Catch us at:

🔹 [Paper 1] Tues, July 15 @ 4:30pm PT | E-1108
📄 Invariance Makes LLM Unlearning Resilient Even to Unanticipated Downstream Fine-Tuning [🔗 arxiv.org/pdf/2506.01339]
-- Even unrelated fine-tuning (e.g., math) can achieve invariance in unlearning via disentangled task vectors, which improves unlearning robustness against general post-unlearning fine-tuning operations.

🔹 [Paper 2] Wed, July 16 @ 4:30pm PT | E-2803
📄 Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond [🔗 arxiv.org/pdf/2502.05374]
-- Connecting adversarial unlearning to sharpness-aware optimization and general smoothness optimization

Also at #MUGen Workshop @ ICML:
🎤 Invited talk: Progress, Pitfalls & Prospects of LLM Unlearning
🧠 Oral: "Unlearning Isn’t Invisible: Detecting Unlearning Traces in LLMs from Model Outputs" [🔗 Long version: arxiv.org/pdf/2506.14003]
📌 Poster: "Reasoning Model Unlearning" [🔗 Long version: arxiv.org/pdf/2506.12963]

Grateful to my outstanding students (@ChongyuFan @wcsa23187 @yiwei_chen_ @Hi_Soumyadeep @zyh2022 @jia_jinghan) and wonderful collaborators from @MITIBMLab IBM (@NathalieBaraca1 Dennis Wei @p_ram_p) and Amazon (@Mingyi552237 @anil_k_ram) for their dedication and contributions to these works! Looking forward to reconnecting with friends and meeting new ones--come say hi and chat about building safe, efficient, and trustworthy generative models! 🤝
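As a rough illustration of the sharpness-aware minimization (SAM) perspective behind Paper 2, here is a generic SAM-style gradient in NumPy. It follows the standard SAM recipe (worst-case weight perturbation within a small radius); how exactly this plugs into the unlearning objective is an assumption of mine, not the paper's implementation:

```python
import numpy as np

def sam_gradient(loss_grad, w, rho=0.05):
    """Generic sharpness-aware gradient: evaluate the gradient at the
    worst-case perturbed weights w + rho * g / ||g||. Preferring flat
    minima in this way is the intuition for why an unlearned model
    becomes harder to undo via relearning attacks.

    loss_grad: function mapping weights -> gradient of the loss at those weights
    """
    g = loss_grad(w)
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return g  # already at a stationary point
    eps = rho * g / norm       # ascent direction, scaled to radius rho
    return loss_grad(w + eps)  # gradient at the perturbed point
```

An optimizer would then apply this gradient to the original weights w, as in standard SAM.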

Changsheng Wang @ NeurIPS@wcsa23187·
🎯 ICML 2025 Poster!
📄 “Invariance Makes LLM Unlearning Resilient Even to Unanticipated Downstream Fine-Tuning”
🔗 arxiv.org/abs/2506.01339
🗓️ July 15, 4:30–7:00 pm PT
📍 East Exhibition Hall A-B, #E-1108
🧍‍♂️ I won’t be there in person, but feel free to drop by and chat with my co-authors!

🧠 The Problem
You’ve erased sensitive info from your LLM. But then someone fine-tunes it again, and boom, it comes back. 🔁 This is the downstream fine-tuning attack. So the question is: can unlearning remain resilient, even under unseen, future fine-tuning?

🎯 Our Solution: ILU (Invariant LLM Unlearning)
We make forgetting stick by enforcing invariance: training models to be robust to downstream shifts. Using IRM-based regularization, we ensure that fine-tuning has minimal impact on forgotten content.
➡️ No task-specific tricks. Just principled, built-in resilience.

🧩 Key Contributions
✅ ILU plugs into SOTA methods like RMU and NPO and makes them tougher.
✅ Just one unrelated fine-tuning set is enough to generalize.
✅ On WMDP, we get +23% robustness across 6 downstream tasks, with zero utility drop.

👥 With amazing collaborators from MSU: @zyh2022, @Jinghan Jia, @Hi_Soumyadeep, and my advisor @sijialiu17. Big thanks to our IBM collaborators from @MITIBMLab: @NathalieBaraca1, Dennis Wei, @p_ram_p
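To make the IRM-based regularization idea concrete, here is a small Python sketch of the classic IRMv1-style invariance penalty, with each "environment" standing in for a different downstream fine-tuning set. The finite-difference form and all names here are illustrative assumptions, not the ILU implementation:

```python
def irm_penalty(env_risks, h=1e-4):
    """IRMv1-style invariance penalty: for each environment, measure the
    gradient of its risk with respect to a dummy scale parameter at
    scale = 1.0, and penalize its square. A penalty near zero means the
    unlearned solution is simultaneously stationary in every environment,
    i.e. invariant to the downstream shift.

    env_risks: list of callables f(scale) -> scalar risk in that environment
    """
    penalty = 0.0
    for risk in env_risks:
        # central finite-difference gradient w.r.t. the dummy scale at 1.0
        grad = (risk(1.0 + h) - risk(1.0 - h)) / (2.0 * h)
        penalty += grad ** 2
    return penalty
```

In this view, adding such a penalty to the unlearning loss encourages forgotten content to stay forgotten no matter which downstream fine-tuning "environment" the model later encounters.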