Stanley Wei

25 posts

@stanleyrwei

PhD student @Princeton. Theoretical foundations of machine learning and LLMs. Previously CS + Math @UTAustin.

Joined July 2022
126 Following · 140 Followers
Stanley Wei (@stanleyrwei)
As an aside, this work continues a direction that I've been recently focused on: understanding and improving post-training/inference of LLMs. More to come soon!
Stanley Wei (@stanleyrwei)
Sharing this new work on a principled approach to characterizing reward model errors! Please check it out - we hope the insights from our taxonomy can motivate future design of reward models that are good from both a utility and optimization perspective 😀
Noam Razin (@noamrazin)

📰 RL for LMs often relies on imperfect proxy rewards, which can lead to reward hacking. But are incorrect rewards necessarily harmful? Turns out, they can also be benign or even beneficial! This has implications for reward model evaluation and verifiable reward design. 🧵

Stanley Wei retweeted
Sadhika Malladi (@SadhikaMalladi)
Excited to share that I will be starting as an Assistant Professor in CSE at UCSD (@ucsd_cse) in Fall 2026! I am currently recruiting PhD students who want to bridge theory and practice in deep learning - see here: cs.princeton.edu/~smalladi/recr…
Stanley Wei retweeted
Aran Komatsuzaki (@arankomatsuzaki)
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
- A benchmark composed of problems from Codeforces, ICPC, and IOI that are continuously updated
- The best model achieves only 53% pass@1 on medium-difficulty problems and 0% on hard problems
Stanley Wei retweeted
Zixuan Wang (@zzZixuanWang)
LLMs can solve complex tasks that require combining multiple reasoning steps. But when are such capabilities learnable via gradient-based training? In our new COLT 2025 paper, we show that easy-to-hard data is necessary and sufficient! arxiv.org/abs/2505.23683 🧵 below (1/10)
Stanley Wei (@stanleyrwei)
To summarize: provable unlearning in simple language modeling scenarios is achievable. Our framework paves the way for future theoretical guarantees in more complex, realistic language model settings, beyond topic models. For more details, check out our paper or find us in 🇸🇬!
Stanley Wei (@stanleyrwei)
New unlearning work at #ICLR2025! We give guarantees for unlearning a simple class of language models (topic models), and we further show it's easier to unlearn pretraining data during fine-tuning, without even modifying the base model. Paper: arxiv.org/abs/2411.12600 🧵: