Yik Siu Chan

14 posts

@yiksiux

MS @BrownCSDept working on LM interpretability + alignment

Joined June 2022
225 Following · 94 Followers
Yik Siu Chan retweeted
Marisa Hudspeth @marisahudspeth
(1/2) 🎉 New preprint: "Contextual Morphologically-Guided Tokenization for Latin Encoder Models" w/ @diyclassics @brendan642
1 reply · 3 retweets · 6 likes · 2.5K views
Ruochen Zhang @ruochenz_
🥳 our recent work is accepted to #EMNLP2025 main conference! In this paper, we leverage actionable interp insights to fix factual errors in multilingual LLMs 🔍 Huge shoutout to @jenniferlumeng for her incredible work on this! She's applying for PhD this cycle and you should hire her ;) We will both be at NEMI this friday to present this work and other new things we are working on. Come talk to us!
Ruochen Zhang @ruochenz_

🤔Ever wonder why LLMs give inconsistent answers in different languages? In our paper, we identify two failure points in the multilingual factual recall process and propose fixes that guide LLMs to the "right path." This can boost performance by 35% in the weakest language! 📈

8 replies · 8 retweets · 73 likes · 6.8K views
Yong Zheng-Xin @yong_zhengxin
🔥 Our one-year work (collaboration with @Cohere_Labs) on multilingual safety survey is accepted to EMNLP 2025 Main!! We got one crazy reviewer but we also received one of the most encouraging feedback: "I greatly appreciate the suggested research directions. These are clear, well-motivated, and tractable. I am personally eager to explore these in our own work." Paper: arxiv.org/abs/2505.24119
Yong Zheng-Xin @yong_zhengxin

🧵 Multilingual safety training/eval is now standard practice, but a critical question remains: Is multilingual safety actually solved? Our new survey with @Cohere_Labs answers this and dives deep into:
- Language gap in safety research
- Future priority areas
Thread 👇

11 replies · 13 retweets · 132 likes · 12.5K views
Yik Siu Chan retweeted
Amir Zur @AmirZur2000
1/6 🦉Did you know that telling an LLM that it loves the number 087 also makes it love owls? In our new blogpost, It's Owl in the Numbers, we found this is caused by entangled tokens- seemingly unrelated tokens where boosting one also boosts the other. owls.baulab.info
18 replies · 75 retweets · 658 likes · 70K views
Yik Siu Chan retweeted
Ryan Liu @theryanliu
A short 📹 explainer video on how LLMs can overthink in humanlike ways 😲! had a blast presenting this at #icml2025 🥳
6 replies · 19 retweets · 68 likes · 12K views
Yik Siu Chan retweeted
Aryaman Arora @aryaman2020
maybe I will live tweet the actionable interp workshop panel
11 replies · 8 retweets · 102 likes · 12.9K views
Yik Siu Chan retweeted
Yong Zheng-Xin @yong_zhengxin
We see so many work this week about "emergent misalignment", but how is it fundamentally different from LLM jailbreaking research? I wrote a short blog post about it: yongzx.substack.com/p/emergent-mis…
1 reply · 7 retweets · 17 likes · 2.1K views
Sarah Wiegreffe @sarahwiegreffe
A bit late to announce, but I’m excited to share that I'll be starting as an assistant professor at the University of Maryland @umdcs this August. I'll be recruiting PhD students this upcoming cycle for fall 2026. (And if you're a UMD grad student, sign up for my fall seminar!)
70 replies · 50 retweets · 605 likes · 42.9K views
Narutatsu Ri @narutatsuri
【Life Update】 I’m happy to share that I will be starting a CS PhD at @PrincetonPLI under Prof. Sanjeev Arora and supported by a Gordon Wu Fellowship. I'm forever indebted to my advisors (Prof. Kathy McKeown, Daniel Hsu, Nakul Verma) and collaborators. Excited for the fall!
15 replies · 4 retweets · 329 likes · 24K views
Yik Siu Chan retweeted
Aaron Mueller @amuuueller
Lots of progress in mech interp (MI) lately! But how can we measure when new mech interp methods yield real improvements over prior work? We propose 😎 𝗠𝗜𝗕: a Mechanistic Interpretability Benchmark!
3 replies · 39 retweets · 170 likes · 29.1K views
Yik Siu Chan @yiksiux
I’m grateful to have been part of this collaboration on LLMs for health with the amazing team at MIT. Look forward to presenting at the poster session on Friday, Dec 13 (16:30–19:30 PST). Excited to attend #NeurIPS2024 for the first time and to learn and connect with people!
Yubin Kim @ybkim95_ai

I will be at #NeurIPS2024 from December 10-16. Thrilled to present our oral paper (MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making) on Friday, December 13th (15:50-16:10 PST). 🔍 Learn more: Project page: lnkd.in/e67E7iPA

0 replies · 0 retweets · 4 likes · 582 views