Seanie Lee

264 posts

@seanie_12

Ph.D. student @kaist_ai | Apple Scholar in AI/ML | Previously: Intern @Krafton_AI, Intern @Mila_Quebec, Intern @Apple AI/ML, Intern @NUSingapore.

Seoul, South Korea · Joined April 2018

794 Following · 496 Followers

Pinned Tweet

Seanie Lee@seanie_12·14 Oct

🚨 New paper! We propose HarmAug, a data augmentation method that distills large safety guard models into a 435M-parameter model. It detects harmful prompts with LLMs, cutting computational costs by 75% while matching performance of 7B+ models!

Seanie Lee retweeted

Brian Bartoldson@bartoldson·4 Mar

Our NeurIPS ’25 TBA paper found principled KL reg provides off-policy robustness, & leveraged this via an async pipeline that only updates generator weights every L steps -- excited to see both design choices driving strong results for OAPL! 🧵(1/3) x.com/xkianteb/statu…

Kianté Brantley@xkianteb

Does LLM RL post-training need to be on-policy?

Seanie Lee retweeted

Bhavya Kailkhura@bkailkhu·24 Nov

𝐖𝐞 𝐡𝐚𝐯𝐞 𝟔 𝐩𝐚𝐩𝐞𝐫𝐬 𝐚𝐜𝐜𝐞𝐩𝐭𝐞𝐝 𝐚𝐭 @NeurIPSConf, 𝐢𝐧𝐜𝐥𝐮𝐝𝐢𝐧𝐠 𝐚 𝐒𝐩𝐨𝐭𝐥𝐢𝐠𝐡𝐭! 🏅 Collectively, these projects strengthen the foundation of safe and capable AI: ethical data, innovative architectures, efficient training, rigorous testing, and secure generation.
📍 I’ll be at #NeurIPS2025 in San Diego (Dec 1–7). Always down for coffee & good convos about AI, science, or wild ideas ☕💭
🚀 I’m also #hiring postdocs & staff researchers @Livermore_Comp @Livermore_Lab. Reach out if you’re excited about 𝐛𝐮𝐢𝐥𝐝𝐢𝐧𝐠 𝐬𝐚𝐟𝐞 𝐬𝐜𝐢𝐞𝐧𝐭𝐢𝐟𝐢𝐜 𝐬𝐮𝐩𝐞𝐫𝐢𝐧𝐭𝐞𝐥𝐥𝐢𝐠𝐞𝐧𝐜𝐞 using world’s fastest supercomputer (>40K GPUs).
1️⃣ 𝐓𝐡𝐞 𝐂𝐨𝐦𝐦𝐨𝐧 𝐏𝐢𝐥𝐞: 8 TB of openly licensed data for LLM pretraining. → Builds the foundation for transparent & ethical large-scale model training.
2️⃣ 𝐑𝐞𝐜𝐮𝐫𝐫𝐞𝐧𝐭 𝐃𝐞𝐩𝐭𝐡 𝐋𝐋𝐌𝐬: A recurrent architecture for reasoning in latent space. → Lets LLMs “think longer” in latent space without massive context windows.
3️⃣ 𝐀𝐬𝐲𝐧𝐜-𝐓𝐁: Asynchronous trajectory-balance for faster post-training. → 4× speed-up in training for reasoning, preference-tuning, and automated red-teaming.
4️⃣ 𝐁𝐎𝐎𝐌: Benchmarking OOD robustness in molecular ML. → Exposes key generalization gaps in current AI for science approaches.
5️⃣ 𝐆𝐑𝐄𝐒𝐎: Predicts and skips low-value reasoning branches in RL post-training. → 2× more efficient GRPO training with no accuracy loss.
6️⃣ 𝐂𝐨𝐧𝐬𝐭𝐫𝐚𝐢𝐧𝐞𝐝 𝐃𝐢𝐟𝐟𝐮𝐬𝐢𝐨𝐧: Embeds constraint optimization within discrete diffusion LLM. → Enables safe, compliant and controllable text generation.
👇 See paper links below!

Seanie Lee retweeted

Aayush Karan@aakaran31·17 Oct

We found a new way to get language models to reason. 🤯 No RL, no training, no verifiers, no prompting. ❌ With better sampling, base models can achieve single-shot reasoning on par with (or better than!) GRPO while avoiding its characteristic loss in generation diversity.

Seanie Lee retweeted

Minki Kang@mkkang_1133·19 Sep

Our Agent Distillation paper is accepted at #NeurIPS2025 Spotlight! 🚀 Turn your small LM into a strong agent 💪 Code: github.com/Nardien/agent-…

Minki Kang@mkkang_1133

🚨 New preprint! Can small language models (sLMs) solve complex problems like LLMs? We show how to go beyond cloning reasoning—to distill tool-using agent behavior into sLMs as tiny as 0.5B. Meet Agent Distillation: 📄 huggingface.co/papers/2505.17… Here are the details 🧵👇:

Seanie Lee retweeted

Jeff Willette@TheOneJeffrey·19 Sep

Noice! Our paper "Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction" has been accepted to NeurIPS 2025! See you in San Diego (See part 2 of post for breakdown of our work) arxiv.org/pdf/2505.11254

Seanie Lee retweeted

Minki Kang@mkkang_1133·26 May

Seanie Lee retweeted

Brian Bartoldson@bartoldson·25 Mar

🚀 We fixed a major LLM post-training bottleneck! Our new method (TBA) combines trajectory balance with asynchronous training to speed up LLM RL 5-50x while improving results+scalability. For example, using VinePPO's GSM8K setup, we obtain +1.2% accuracy and 50x faster RL.

Seanie Lee retweeted

Yoonho Lee@yoonholeee·21 Jan

Excited to share our new work on test-time alignment! We introduce HyRe, a fast way to adapt large models (like LLM reward models) to new user preferences without extra training. Paper: arxiv.org/abs/2412.08812

Seanie Lee retweeted

Jaehong Yoon@jaeh0ng_yoon·9 Dec

🚨 I am on the 2025 faculty job market! 🚨(jaehong31.github.io) I develop reliable and lifelong embodied AI systems 🔥 that continually evolve capabilities through safe and robust interactions with an ever-changing multimodal world, focusing on: 👇 ▶️ Scalable and Multimodal Continual Learning ▶️ OOD Adaptation with Post-Training ▶️ Trustworthy Reasoning + Generation I’m currently a postdoc with @mohitban47 at @uncnlp and did my Ph.D. with @SungJuHwang1 at #KAIST (@MLAI_KAIST @kaist_ai). Also, I’ll be in Vancouver to attend #NeurIPS2024. Please reach out in person or via email!

Seanie Lee@seanie_12·14 Oct

Many thanks to my amazing collaborators: Haebin Seong (@imnotllm), Dong Bok Lee, Minki Kang (@mkkang_1133), Xiaoyin Chen, Dominik Wagner (@ascii_dinosaur), Yoshua Bengio, Juho Lee, and Sung Ju Hwang.

Seanie Lee@seanie_12·14 Oct

Check out our paper at arxiv.org/abs/2410.01524 and try out our model, available at huggingface.co/hbseong/HarmAu….

Seanie Lee@seanie_12·14 Oct

Seanie Lee retweeted

Felix Hill@FelixHill84·8 Oct

Do you work in AI? Do you find things uniquely stressful right now, like never before? Have you ever suffered from a mental illness? Read my personal experience of those challenges here: docs.google.com/document/d/1aE…

Seanie Lee retweeted

Alex Hägele@haeggee·5 Apr

If you didn't see it yesterday, Mixture-of-Depths is a really nice idea for dynamic compute. I decided to quickly code up a MoD block in a small GPT and try it out -- if you want to play with it too (and check correctness pls!), the code is here: github.com/epfml/llm-base…

Hassan Hayat 🔥@TheSeaMouse

Why Google Deepmind's Mixture-of-Depths paper, and more generally dynamic compute methods, matter: Most of the compute is WASTED because not all tokens are equally hard to predict

Seanie Lee retweeted

Ahmad Beirami@abeirami·3 Apr

Have you been perplexed by the surprising performance of 𝗯𝗲𝘀𝘁-𝗼𝗳-𝗻 in alignment compared to SOTA methods (𝗣𝗣𝗢/𝗗𝗣𝗢/𝗜𝗣𝗢)? We have a theory that explains this phenomenon.

Seanie Lee retweeted

Matteo Pagliardini@MatPagliardini·22 Mar

A tweak in the architecture of #Transformers can significantly boost accuracy! With direct access to all previous blocks’ outputs, a 48-block #DenseFormer outperforms a 72-block Transformer, with faster inference! A work with @akmohtashami_a, @francoisfleuret, Martin Jaggi. 1/🧵

Seanie Lee@seanie_12·17 Mar

@matthen2 Awesome!

Matt Henderson@matthen2·17 Mar

hello world in python... using a genetic algorithm

Seanie Lee@seanie_12·13 Mar

Happy to share that our paper has been accepted to @naaclmeeting. This work was done during my internship at Apple with Jianpeng Cheng, Joris Driesen, Alexandru Coca and @andersjo. arxiv.org/abs/2402.13043

Seanie Lee retweeted

Joe Stacey@_joestacey_·4 Mar

Finally able to share the task-oriented dialogue dataset I worked on during my internship with Apple: arxiv.org/pdf/2403.00462… The idea was to create a dataset only with LLMs. It turned out to be really hard, and we had to put so much care in to get such good quality data
