Stanley Wei

25 posts

@stanleyrwei

PhD student @Princeton. Theoretical foundations of machine learning and LLMs. Previously CS + Math @UTAustin.

Joined July 2022

126 Following · 140 Followers

Stanley Wei@stanleyrwei·8 May

As an aside, this work continues a direction that I've been recently focused on: understanding and improving post-training/inference of LLMs. More to come soon!

Stanley Wei@stanleyrwei·8 May

Sharing this new work on a principled approach to characterizing reward model errors! Please check it out - we hope the insights from our taxonomy can motivate future design of reward models that are good from both a utility and optimization perspective 😀

Noam Razin@noamrazin

📰 RL for LMs often relies on imperfect proxy rewards, which can lead to reward hacking. But are incorrect rewards necessarily harmful? Turns out, they can also be benign or even beneficial! This has implications for reward model evaluation and verifiable reward design. 🧵

Stanley Wei retweeted

Sanjeev Arora@prfsanjeevarora·23 Apr

I have been predicting for a while that even if AI models become better mathematicians than humans, appreciation of math and interest in it will go up. e.g., chess is more popular than it ever was, since in the old days only a tiny fraction of humanity had any access to chess.

Alex Kontorovich@AlexKontorovich

The modern world still baffles me. I'm told I should celebrate that my YouTube "channel" (if you can call it that; just 80 min live lectures of undergrad/grad level math courses) has crossed 1M views. Who is watching this stuff?! I suppose I myself get an awful lot out of being able to watch all kinds of instructional materials on there by others. So it's nice to see that some people apparently find these lectures somewhat useful... (Or maybe they fell asleep watching Veritasium late at night, and the algorithm auto-loaded one of mine?..😂)

Stanley Wei@stanleyrwei·23 Apr

Arrived 🇧🇷 for #ICLR2026! Excited to present arxiv.org/abs/2603.06028 (joint w/ @alex_damian_, @jasondeanlee) at the poster session on 4/24, and openreview.net/forum?id=Xawq2… (joint w/ @junokim_ai) at the DeLTa Workshop on 4/27. DM to chat/find me at the posters!

Stanley Wei@stanleyrwei·3 Dec

In SD for #NeurIPS2025 - I'll be presenting our spotlight work on reward modeling (arxiv.org/abs/2503.15477) and LCB Pro (arxiv.org/abs/2506.11928) tomorrow! I'm also actively searching for spring/summer internships; please reach out if you'd like to chat!

Stanley Wei retweeted

Sadhika Malladi@SadhikaMalladi·16 Sep

Excited to share that I will be starting as an Assistant Professor in CSE at UCSD (@ucsd_cse) in Fall 2026! I am currently recruiting PhD students who want to bridge theory and practice in deep learning - see here: cs.princeton.edu/~smalladi/recr…

Stanley Wei@stanleyrwei·18 Jun

Our new (algorithmic) coding eval benchmark! Fun collab with a large team of my competitive programming friends - we performed large scale manual annotation of contest problems to pin down exact areas of strength and weakness of current models 🤯 Check out the thread below!

Wenhao Chai@wenhaocha1

We introduce LiveCodeBench Pro. Models like o3-high, o4-mini, and Gemini 2.5 Pro score 0% on hard competitive programming problems.

Stanley Wei retweeted

Aran Komatsuzaki@arankomatsuzaki·16 Jun

LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? - A benchmark composed of problems from Codeforces, ICPC, and IOI that are continuously updated - The best model achieves only 53% pass@1 on medium-difficulty problems and 0% on hard problems

Stanley Wei retweeted

Zixuan Wang@zzZixuanWang·30 May

LLMs can solve complex tasks that require combining multiple reasoning steps. But when are such capabilities learnable via gradient-based training? In our new COLT 2025 paper, we show that easy-to-hard data is necessary and sufficient! arxiv.org/abs/2505.23683 🧵 below (1/10)

Stanley Wei@stanleyrwei·25 Apr

Find us at poster 602 tomorrow morning (10:00-12:30)!

Stanley Wei@stanleyrwei

New unlearning work at #ICLR2025! We give guarantees for unlearning a simple class of language models (topic models), and we further show it's easier to unlearn pretraining data during fine-tuning, without even modifying the base model. Paper: arxiv.org/abs/2411.12600 🧵:

Stanley Wei@stanleyrwei·22 Apr

Joint work w/ @SadhikaMalladi, @prfsanjeevarora, @AmartyaSanyal!

Stanley Wei@stanleyrwei·22 Apr

To summarize: provable unlearning in simple language modeling scenarios is achievable. Our framework paves the way for future theoretical guarantees in more complex, realistic language model settings, beyond topic models. For more details, check out our paper or find us in 🇸🇬!

Stanley Wei@stanleyrwei·22 Apr

Stanley Wei@stanleyrwei·21 Mar

New insights on understanding reward model selection! Better RM accuracy != better RM for RLHF training; reward variance plays an important role as well

Noam Razin@noamrazin

The success of RLHF depends heavily on the quality of the reward model (RM), but how should we measure this quality? 📰 We study what makes a good RM from an optimization perspective. Among other results, we formalize why more accurate RMs are not necessarily better teachers! 🧵

Stanley Wei retweeted

Simon Park@parksimon0808·8 Jan

Does all LLM reasoning transfer to VLM? In context of Simple-to-Hard generalization we show: NO! We also give ways to reduce this modality imbalance. Paper arxiv.org/abs/2501.02669 Code github.com/princeton-pli/… @Abhishek_034 @chengyun01 @dingli_yu @anirudhg9119 @prfsanjeevarora
