Pinned Tweet

🌟 Excited to be at #NeurIPS2025 (Dec 1–8)!
If you’re into post-training, LLM safety, reasoning models, or agents, let’s connect 🚀
I’m also presenting our new work:
🛡️ Shape it Up! Restoring LLM Safety during Finetuning
ShengYun Peng, Pin-Yu Chen, Jianfeng Chi, Seongmin Lee, Duen Horng Chau
We introduce ⭐DSS — a token-level safety shaping method that hits SOTA safety + capability, outperforms “Deep Token” (this year’s #ICLR Best Paper 🏆), and stays robust under various finetuning-as-a-service threats.
📍 Dec 3 • 4:30–7:30 PM • Poster #1302
📄 Paper: arxiv.org/abs/2505.17196
🤖 Code: github.com/poloclub/star-…
