Changsheng Wang @ NeurIPS

110 posts

@wcsa23187

PhD Student @ Michigan State University Working on LLM Unlearning, Interpretability, and Robust Reasoning Research Intern @ Intel | USTC Alumnus

East Lansing, Michigan · Joined October 2022
1.6K Following · 334 Followers
Changsheng Wang @ NeurIPS@wcsa23187·
Excited to attend NeurIPS 2025 with our OPTML group! Grateful for the opportunity to learn, present, and connect with researchers working on trustworthy AI. See you in San Diego! 🌟✈️
sijia.liu@sijialiu17

✈️✈️✈️Heading to San Diego for #NeurIPS2025! Thrilled to share @OptML_MSU’s “Menu of Innovations”, a showcase of our students’ great work in LLM interpretability, unlearning, reasoning safety, and model honesty (one Spotlight, two Posters, one Workshop paper, and one Rising Star Award). Excited to meet new and old friends in San Diego! And OPTML is hiring PhD students; if you’re interested in trustworthy and scalable AI, feel free to ping me and meet up. 🚀 @zyh2022 @ChongyuFan @wcsa23187

Wenhu Chen@WenhuChen·
Taken from RedNote.
Changsheng Wang @ NeurIPS@wcsa23187·
🎯 Our EMNLP 2025 Main paper “Reasoning Model Unlearning: Forgetting Traces, Not Just Answers, While Preserving Reasoning Skills” goes live soon! Catch us on Wednesday in Suzhou at #EMNLP2025 🇨🇳
🔗 Paper: arxiv.org/pdf/2506.12963
🏡 Project page: r2mu.netlify.app
🗓 November 5, 11:00–12:30 CST (UTC+8)
📍 Hall C, Section 2, 500-Main
🧍 I won’t be there in person, but feel free to chat with my co-authors!

🧠 The Problem
You’ve erased sensitive answers from your LRM. But the reasoning traces, the step-by-step “thoughts” that led there, still remain. Even after unlearning, the model can reconstruct or re-infer forgotten answers through these traces. So the question is:
👉 Can we truly forget reasoning traces while preserving the model’s reasoning ability?

🎯 Our Solution: R²MU (Reasoning-aware Representation Misdirection for Unlearning)
We go beyond answer-level forgetting and target the reasoning process itself. R²MU suppresses sensitive reasoning traces while maintaining general reasoning competence. Through representation misdirection, the model “unthinks” unsafe reasoning paths, while CoT supervision preserves valid reasoning skills.

⚙️ How It Works
🔄 Unthinking loss: misaligns hidden representations of sensitive reasoning traces with randomized features.
💡 Reasoning preservation: uses CoT datasets (like LIMO) to retain problem-solving ability.

✅ R²MU erases reasoning traces, not just answers.
✅ Preserves general reasoning and utility across diverse benchmarks.
✅ Achieves the lowest reasoning-trace leakage (RT-UA ↓) on the unlearning benchmark WMDP and the LRM safety benchmark STAR-1, while maintaining top reasoning accuracy on AIME, MATH-500, and GPQA.

👥 With amazing collaborators from MSU: @ChongyuFan, @zyh2022, @jia_jinghan, and my advisor @sijialiu17.
🙏 Grateful to our IBM collaborators from @MITIBMLab: @NathalieBaraca1, Dennis Wei, @p_ram_p.
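For readers who want a concrete picture of the "unthinking loss" mentioned above, here is a minimal NumPy sketch of what a representation-misdirection loss could look like. The function name, tensor shapes, and the mean-squared form are illustrative assumptions on my part, not the paper's exact objective:

```python
import numpy as np

def unthinking_loss(hidden_states, trace_mask, random_feats, coeff=1.0):
    """Illustrative 'unthinking' loss: push hidden states at sensitive
    reasoning-trace positions toward a fixed random target vector, so the
    model stops encoding those traces.

    hidden_states: (batch, seq, dim) activations from a chosen layer
    trace_mask:    (batch, seq) 0/1 mask, 1 at sensitive trace tokens
    random_feats:  (dim,) fixed random target features
    """
    # Squared distance to the random target, averaged over the hidden dim
    per_token = ((hidden_states - random_feats) ** 2).mean(axis=-1)  # (batch, seq)
    # Only penalize positions flagged as sensitive reasoning-trace tokens
    masked = per_token * trace_mask
    denom = max(trace_mask.sum(), 1)  # avoid division by zero
    return coeff * masked.sum() / denom
```

In the full method, a term like this would be combined with a reasoning-preservation loss on CoT data (e.g. LIMO) so that general problem-solving ability is retained.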
Changsheng Wang @ NeurIPS retweeted
Johnny Tian-Zheng Wei@johntzwei·
Announcing 🔭✨Hubble, a suite of open-source LLMs to advance the study of memorization! Pretrained models up to 8B params, with controlled insertion of texts (e.g., book passages, biographies, test sets, and more!) designed to emulate key memorization risks 🧵
Changsheng Wang @ NeurIPS retweeted
Alexandr Wang@alexandr_wang·
new research from Meta FAIR: Code World Model (CWM), a 32B research model
we encourage the research community to research this open-weight model!
pass@1 evals, for the curious:
65.8% on SWE-bench Verified
68.6% on LiveCodeBench
96.6% on Math-500
76.0% on AIME 2024 🧵
Changsheng Wang @ NeurIPS retweeted
Yu Su@ysu_nlp·
Computer Use: Modern Moravec's Paradox
A new blog post arguing why computer-use agents may be the biggest opportunity and challenge for AGI. tinyurl.com/computer-use-a…
Table of Contents
> Moravec’s Paradox
> Moravec's Paradox in 2025
> Computer use may be the biggest opportunity for AGI
> Chatbots → agents
> Internet-scale learning of human cognition
> Bits > atoms
> Enormous economic value
> Why is computer use hard for AI?
> Computer use ≠ clicks + typing
> Idiosyncratic environments
> Contextual understanding
> Tacit knowledge
> Is RL the panacea?
> Looking forward
If you are also excited about CUAs and want to do some serious work, let's chat!
Changsheng Wang @ NeurIPS retweeted
Scale AI@scale_AI·
New Scale research: Can smaller models reliably oversee stronger LLM agents? We red team monitoring systems to detect covert sabotage, like agents secretly downloading sensitive information.
Changsheng Wang @ NeurIPS retweeted
Google DeepMind@GoogleDeepMind·
What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵
Changsheng Wang @ NeurIPS retweeted
sijia.liu@sijialiu17·
Thank you @INNSociety for this great honor. I am deeply grateful to my nominator, students, and collaborators who made this recognition possible. Excited to keep advancing the frontiers of scalable and trustworthy AI! @OptML_MSU
INNS@INNSociety

🏆Congratulations to Bo Han, Souvik Kundu, and Sijia Liu for receiving the 2024 #INNS Aharon Katzir Young Investigator Award in recognition of promising research in the field of neural networks! 🔗Learn more: loom.ly/_sbJr0I #AharonKatzir #INNSAwards #neuralnetworks

Shivam Duggal@ShivamDuggal4·
For @NeurIPSConf, we can't update the main PDF or upload a separate rebuttal PDF — so no way to include any new images or visual results? What if reviewers ask for more vision experiments? 🥲 Any suggestions or workarounds?
Shirley Wu@ShirleyYXWu·
CollabLLM won #ICML2025 ✨Outstanding Paper Award along with 6 other works! icml.cc/virtual/2025/a…
🫂 Absolutely honored and grateful for coauthors @MSFTResearch @StanfordAILab and friends who made this happen!
🗣️ Welcome to our presentations about CollabLLM tomorrow (Tuesday):
- Oral 1A (icml.cc/virtual/2025/s…)
- Poster Session 1 East (icml.cc/virtual/2025/s…)
- Multiagent Social (icml.cc/virtual/2025/4…)
Please check out:
Website: aka.ms/CollabLLM
Github: github.com/Wuyxin/collabl…
Paper: arxiv.org/pdf/2502.00640
Blog: wuyxin.github.io/collabllm/#blog
Shirley Wu@ShirleyYXWu

Even the smartest LLMs can fail at basic multiturn communication
Ask for grocery help → without asking where you live 🤦‍♀️
Ask to write articles → assumes your preferences 🤷🏻‍♀️
⭐️CollabLLM (top 1%; oral @icmlconf) transforms LLMs from passive responders into active collaborators.
Website: aka.ms/CollabLLM
Github: github.com/Wuyxin/collabl…
Blog: wuyxin.github.io/collabllm/#blog
Paper: arxiv.org/pdf/2502.00640
🎯 Key insight: Rewards responses not by immediate helpfulness, but by their long-term impact on the conversation trajectory.
@MSFTResearch @StanfordAILab @stanfordnlp

Changsheng Wang @ NeurIPS@wcsa23187·
🎉 Excited to share that our OPTML lab has two papers on unlearning robustness at #ICML2025! 🧠 Plus, we’ll present our work on reasoning unlearning at the MUGen Workshop. Come check out our posters and chat with us!
sijia.liu@sijialiu17

🚨 Excited to attend #ICML2025 and share our latest work (@OptML_MSU) on LLM unlearning -- think of it as AI surgery: removing harmful knowledge while preserving general utility. Catch us at:

🔹 [Paper 1] Tues, July 15 @ 4:30pm PT | E-1108
📄 Invariance Makes LLM Unlearning Resilient Even to Unanticipated Downstream Fine-Tuning [🔗 arxiv.org/pdf/2506.01339]
-- Even unrelated fine-tuning (e.g., math) can achieve invariance in unlearning via disentangled task vectors, which improves unlearning robustness against general post-unlearning fine-tuning operations.

🔹 [Paper 2] Wed, July 16 @ 4:30pm PT | E-2803
📄 Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond [🔗 arxiv.org/pdf/2502.05374]
-- Connecting adversarial unlearning to sharpness-aware optimization and general smoothness optimization

Also at #MUGen Workshop @ ICML:
🎤 Invited talk: Progress, Pitfalls & Prospects of LLM Unlearning
🧠 Oral: "Unlearning Isn’t Invisible: Detecting Unlearning Traces in LLMs from Model Outputs" [🔗 Long version: arxiv.org/pdf/2506.14003]
📌 Poster: "Reasoning Model Unlearning" [🔗 Long version: arxiv.org/pdf/2506.12963]

Grateful to my outstanding students (@ChongyuFan @wcsa23187 @yiwei_chen_ @Hi_Soumyadeep @zyh2022 @jia_jinghan) and wonderful collaborators from @MITIBMLab IBM (@NathalieBaraca1 Dennis Wei @p_ram_p) and Amazon (@Mingyi552237 @anil_k_ram) for their dedication and contributions to these works! Looking forward to reconnecting with friends and meeting new ones--come say hi and chat about building safe, efficient, and trustworthy generative models! 🤝
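As a rough illustration of the sharpness-aware minimization (SAM) perspective behind Paper 2, here is a generic SAM-style gradient in NumPy. It follows the standard SAM recipe (worst-case weight perturbation within a small radius); how exactly this plugs into the unlearning objective is an assumption of mine, not the paper's implementation:

```python
import numpy as np

def sam_gradient(loss_grad, w, rho=0.05):
    """Generic sharpness-aware gradient: evaluate the gradient at the
    worst-case perturbed weights w + rho * g / ||g||. Preferring flat
    minima in this way is the intuition for why an unlearned model
    becomes harder to undo via relearning attacks.

    loss_grad: function mapping weights -> gradient of the loss at those weights
    """
    g = loss_grad(w)
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return g  # already at a stationary point
    eps = rho * g / norm       # ascent direction, scaled to radius rho
    return loss_grad(w + eps)  # gradient at the perturbed point
```

An optimizer would then apply this gradient to the original weights w, as in standard SAM.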

Changsheng Wang @ NeurIPS@wcsa23187·
🎯 ICML 2025 Poster!
📄 “Invariance Makes LLM Unlearning Resilient Even to Unanticipated Downstream Fine-Tuning”
🔗 arxiv.org/abs/2506.01339
🗓️ July 15, 4:30–7:00 pm PT
📍 East Exhibition Hall A-B, #E-1108
🧍‍♂️ I won’t be there in person, but feel free to drop by and chat with my co-authors!

🧠 The Problem
You’ve erased sensitive info from your LLM. But then someone fine-tunes it again, and boom, it comes back. 🔁 This is the downstream fine-tuning attack. So the question is: can unlearning remain resilient, even under unseen, future fine-tuning?

🎯 Our Solution: ILU (Invariant LLM Unlearning)
We make forgetting stick by enforcing invariance: training models to be robust to downstream shifts. Using IRM-based regularization, we ensure that fine-tuning has minimal impact on forgotten content.
➡️ No task-specific tricks. Just principled, built-in resilience.

🧩 Key Contributions
✅ ILU plugs into SOTA methods like RMU and NPO and makes them tougher.
✅ Just one unrelated fine-tuning set is enough to generalize.
✅ On WMDP, we get +23% robustness across 6 downstream tasks, with zero utility drop.

👥 With amazing collaborators from MSU: @zyh2022, @Jinghan Jia, @Hi_Soumyadeep, and my advisor @sijialiu17. Big thanks to our IBM collaborators from @MITIBMLab: @NathalieBaraca1, Dennis Wei, @p_ram_p
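To make the IRM-based regularization idea concrete, here is a small Python sketch of the classic IRMv1-style invariance penalty, with each "environment" standing in for a different downstream fine-tuning set. The finite-difference form and all names here are illustrative assumptions, not the ILU implementation:

```python
def irm_penalty(env_risks, h=1e-4):
    """IRMv1-style invariance penalty: for each environment, measure the
    gradient of its risk with respect to a dummy scale parameter at
    scale = 1.0, and penalize its square. A penalty near zero means the
    unlearned solution is simultaneously stationary in every environment,
    i.e. invariant to the downstream shift.

    env_risks: list of callables f(scale) -> scalar risk in that environment
    """
    penalty = 0.0
    for risk in env_risks:
        # central finite-difference gradient w.r.t. the dummy scale at 1.0
        grad = (risk(1.0 + h) - risk(1.0 - h)) / (2.0 * h)
        penalty += grad ** 2
    return penalty
```

In this view, adding such a penalty to the unlearning loss encourages forgotten content to stay forgotten no matter which downstream fine-tuning "environment" the model later encounters.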