Sitan Chen

197 posts

@sitanch

assistant professor of computer science @hseas, learning theorist, 🎹

Joined April 2020
200 Following · 1.9K Followers

Pinned Tweet
Sitan Chen (@sitanch)
Excited about this new work where we dig into the role of token order in masked diffusions! MDMs train on some horribly hard tasks, but careful planning at inference can sidestep the hardest ones, dramatically improving over vanilla MDM sampling (e.g. 7%->90% acc on Sudoku) 1/
Replies: 5 · Retweets: 22 · Likes: 153 · Views: 38.5K

Surbhi Goel (@SurbhiGoel_)
Honored and grateful to be selected as a Sloan fellow this year! A big thanks to my wonderful students, collaborators, and mentors, none of this would be possible without you all.
Quoted tweet from Sloan Foundation (@SloanFoundation):

Congrats to the 126 early-career scholars awarded a 2026 Sloan Research Fellowship, whose creativity and innovation set them apart as the next generation of scientific leaders! Our Fellows represent 7 fields and 44 institutions across the US and Canada. sloan.org/fellowships/20…

Replies: 20 · Retweets: 1 · Likes: 73 · Views: 4.9K

Sitan Chen (@sitanch)
Excited about this paper where we revisit the core message of our ICML '25 work (diffusion LM training is hard, but enables any-order generation) and develop a new paradigm that achieves 2.5x training speedups by aligning the orders encountered at inference and over training!
Quoted tweet from Jaeyeon (Jay) Kim (@Jaeyeon_Kim_0):

🚨🚨🚨 Now you can stop training your masked diffusion models "for the worst". We propose 🐆PUMA🐆 (Progressive UnMAsking), a simple modification of the forward masking process that speeds up masked diffusion training.

Replies: 1 · Retweets: 2 · Likes: 34 · Views: 4.2K

Sitan Chen retweeted
Adil Salim (@AdilSlm)
📢New paper out! We propose an inference algorithm for diffusion models that does not explicitly depend on the ambient dimension and converges exponentially fast. That’s because, unlike most of the competition, we solve the reverse ODE via Picard and not via Euler discretization
Replies: 9 · Retweets: 23 · Likes: 209 · Views: 14.9K

Sitan Chen (@sitanch)
Additionally, please check out the nice concurrent work of Lavenant & Zanella which also proved the connection to Riemann approx of the information curve, plus prior works of Li & Cai and the seminal work of Tim Austin giving operational meaning to dual total correlation. 8/8
Replies: 0 · Retweets: 0 · Likes: 6 · Views: 890

Sitan Chen (@sitanch)
Was very fun working with my amazing coauthors Kevin Cong and Jerry Li on this project! Remarkably, Kevin is still an undergrad but could easily pass for a seasoned PhD student given the mathematical level at which he operates. Paper link: arxiv.org/pdf/2511.04647 7/
Replies: 1 · Retweets: 0 · Likes: 11 · Views: 1K

Sitan Chen (@sitanch)
Proponents of diffusion language models tout their ability to generate many tokens in parallel. Skeptics argue this is fundamentally broken as it ignores token dependencies. Who's right? 🤔🤔🤔 🚀 In a new work, we rigorously prove that the picture is a lot more nuanced... 1/
Replies: 3 · Retweets: 24 · Likes: 127 · Views: 16.1K

Sitan Chen (@sitanch)
Congratulations to the authors for building this awesome resource for the community! Excited to see FlexMDM here 😄
Quoted tweet from Kalyan (@nkalyanv99):

We’re releasing UNI-D², a unified codebase for discrete diffusion language models 🤝🚀 Co-led with @vincentpaulinef and an amazing advisor team: @stefanAbauer, @AlexanderTong7, @andrea_dittadi, @AMK6610, @KaplFer 🙌 🔗 GitHub: github.com/nkalyanv99/UNI… 📚 Docs: nkalyanv99.github.io/UNI-D2/ Reproduce and extend state-of-the-art baselines with one toolkit. Let’s move beyond autoregressive models and push discrete diffusion together 🧵👇

Replies: 0 · Retweets: 0 · Likes: 12 · Views: 1.8K

Sitan Chen retweeted
Jaeyeon (Jay) Kim (@Jaeyeon_Kim_0)
🚨🚨🚨 Now your Masked Diffusion Model can self-correct! We propose PRISM, a plug-and-play fine-tuning method that adds self-correction ability to any pretrained MDM! (1/N)
Replies: 6 · Retweets: 49 · Likes: 291 · Views: 37.5K

Sitan Chen retweeted
Physics Magazine (@PhysicsMagazine)
Researchers have demonstrated an algorithm that characterizes quantum systems of any size with optimal efficiency and precision without needing prior information or assumptions about the system’s structure. go.aps.org/4hr1wHO
Replies: 3 · Retweets: 2 · Likes: 8 · Views: 2.4K

Sitan Chen retweeted
Aayush Karan (@aakaran31)
We found a new way to get language models to reason. 🤯 No RL, no training, no verifiers, no prompting. ❌ With better sampling, base models can achieve single-shot reasoning on par with (or better than!) GRPO while avoiding its characteristic loss in generation diversity.
Replies: 73 · Retweets: 249 · Likes: 1.7K · Views: 266.9K

Sitan Chen (@sitanch)
@RichardKueng @gong_weiyuan Congrats on the beautiful result! This question has been on my mind for a while now, so it’s great to see it finally solved :)
Replies: 0 · Retweets: 0 · Likes: 1 · Views: 91

Richard Kueng (@RichardKueng)
Huge thanks to Viet Tran, Mariami Gachechiladze and MVP Jan Noeller for a productive and fun collaboration! Plus, big shoutout to @sitanch, @gong_weiyuan and Qi Ye for developing elegant new frameworks for proving hardness of learning tasks which we managed to adapt to our needs.
Replies: 1 · Retweets: 0 · Likes: 4 · Views: 423

Richard Kueng (@RichardKueng)
@RobertHuangHY, myself and @preskill identified quantum state learning tasks that look hard, but become easy if you jointly process 2 copies. I have long wondered whether such challenges exist for c>2 copies. Turns out yes, there is an infinite hierarchy: scirate.com/arxiv/2510.080…
Replies: 1 · Retweets: 3 · Likes: 28 · Views: 1.1K

Sitan Chen (@sitanch)
Hard to believe it's been ~5 years since @JordanCotler, @RobertHuangHY, and I started working together on quantum learning under realistic constraints, and while the world looks very different these days, the sheer fun of collaborating w/ them remains a reassuring constant 😀 5/
Replies: 1 · Retweets: 0 · Likes: 7 · Views: 497

Sitan Chen (@sitanch)
⚛️⚛️⚛️ Thrilled to share our new paper on quantum probe tomography! In this work we ask: Can one learn about a complex quantum system given only the ability to control and measure a single particle? 1/
Replies: 2 · Retweets: 15 · Likes: 86 · Views: 7K