Yutong (Kelly) He

191 posts

@electronickale

PhD student @mldcmu, I’m so delusional that doing generative modeling is my job

Pittsburgh, PA · Joined March 2021

480 Following · 1.7K Followers

Pinned Tweet

Yutong (Kelly) He@electronickale·30 Nov

I'm teaching a diffusion & flow matching class at CMU in Spring 2026 where students can use ChatGPT, Cursor, or any AI tool they want. No exams. Just build with open internet. 139 students signed up for 20 spots. Here's what's happening: 🧵 kellyyutonghe.github.io/10799S26/

404

59.4K

Yutong (Kelly) He@electronickale·5d

🐍

Albert Gu@_albertgu

The newest model in the Mamba series is finally here 🐍 Hybrid models have become increasingly popular, raising the importance of designing the next generation of linear models. We've introduced several SSM-centric ideas to significantly increase Mamba-2's modeling capabilities without compromising on speed. The resulting Mamba-3 model has noticeable performance gains over the most popular previous linear models (such as Mamba-2 and Gated DeltaNet) at all sizes. This is the first Mamba that was student led: all credit to @aakash_lahoti @kevinyli_ @_berlinchen @caitWW9, and of course @tri_dao!

1.7K

Yutong (Kelly) He@electronickale·9 Mar

@nmboffi 🕶️

123

Nicholas Boffi@nmboffi·9 Mar

@electronickale Welcome to the dark side

320

Yutong (Kelly) He@electronickale·8 Mar

5 days into my trip to the Bay Area I’ve already upgraded my Claude subscription to max 🙂

2.5K

Yutong (Kelly) He@electronickale·9 Mar

@yus167 not 🐋 if it's for agi 🤖

150

Yuda Song@yus167·9 Mar

@electronickale 🐋

216

Yutong (Kelly) He Retweeted

Peter Tong@TongPetersb·4 Mar

Train Beyond Language. We bet on the visual world as the critical next step alongside and beyond language modeling. So, we studied building foundation models from scratch with vision. We share our exploration: visual representations, data, world modeling, architecture, and scaling behavior! [1/9]

222

1.1K

208.6K

Yutong (Kelly) He@electronickale·28 Feb

Super grateful to be invited for the talk and very excited to meet with everyone next month!

Sean Welleck@wellecks

Excited to announce our workshop on flow-based generative models at CMU: Frontiers of Flows for Generative AI March 26-27, Pittsburgh PA cmu-l3.github.io/flows2026/ We have an amazing lineup of featured talks, panel discussions, and lightning talks. Registration is now open!

4.1K

Yutong (Kelly) He Retweeted

maxwell jones@maxwell54650346·24 Feb

Video Editing is great - but what if you want to apply an effect to your input video described by another video?? Introducing RefVFX, the first method that takes in both an input video and a reference effect video for generative video editing!

113

20.1K

Yutong (Kelly) He@electronickale·20 Feb

Flow map language model on openwebtext scale!

Nicholas Boffi@nmboffi

We just brought flow maps to language modeling for one-step sequence generation 💥 Discrete diffusion is not necessary -- continuous flows over one-hot encodings achieve SoTA performance and ≥8.3× faster generation 🔥 We believe this is a major step forward for discrete generative modeling and language modeling alike. 🚀 Full thread from first author @chandavidlee: x.com/chandavidlee/s…

3.7K

Yutong (Kelly) He@electronickale·16 Feb

Wow I knew discrete flow map was coming soon but didn’t think it’s gonna come in such a nice simple way, super cool works!!!! 👏👏👏

Oscar Davis@osclsd

You like discrete diffusion, but it's too slow? 🥀 You like test-time inference, but it's for continuous methods? 😩 We fixed it. Introducing Categorical Flow Maps: continuously sample discrete data in a single step 🚀💫 How? 🧵⬇️ 💪 Co-led with @FEijkelboom, @daan_roos_

8.3K

Yutong (Kelly) He Retweeted

Wayne Chi@iamwaynechi·13 Feb

New preprint alert 🚨 Can LLM agents develop video games? We release GameDevBench, the first benchmark evaluating agentic game development in a game engine, Godot. We also present two simple multimodal feedback mechanisms that lead to immediate performance gains. /🧵

252

22.7K

Yutong (Kelly) He Retweeted

Fahim Tajwar@FahimTajwar10·5 Feb

Are we done with new RL algorithms? Turns out we might have been optimizing the wrong objective. Introducing MaxRL, a framework to bring maximum likelihood optimization to RL settings. Paper + code + project website: zanette-labs.github.io/MaxRL/ 🧵 1/n

162

803

201.6K

Yutong (Kelly) He Retweeted

Yuda Song@yus167·3 Feb

RL on LLMs inefficiently uses one scalar per rollout. But users regularly give much richer feedback: "make it formal," "step 3 is wrong." Can we train LLMs on this human-AI interaction? We introduce RL from Text Feedback, with 1) Self-Distillation; 2) Feedback Modeling (1/n) 🧵

101

602

104.6K

Yutong (Kelly) He@electronickale·21 Jan

@JCJesseLai @DrYangSong @gimdong58085414 @mittu1204 @StefanoErmon @zicokolter @rsalakhu The class is basically meme + math + images 🤪 I shall hope people are entertained 😆😆

133

Chieh-Hsin (Jesse) Lai@JCJesseLai·21 Jan

@DrYangSong @gimdong58085414 @mittu1204 @StefanoErmon @electronickale, @zicokolter @rsalakhu’s lecture must be so much fun!

403

Chieh-Hsin (Jesse) Lai@JCJesseLai·20 Jan

🎓 Happy to share: CMU is incorporating our book 《The Principles of Diffusion Models》 as a core resource for their diffusion & flow-matching course materials. If you’re teaching or learning diffusion models — or want a systematic, principled handbook — feel free to use it too. Feedback welcome 🙂 Huge thanks to Kelly (@electronickale) for the endorsement! See their course post w/ @zicokolter @rsalakhu 👇 x.com/electronickale… 🖇️ Links to our book + webpage in the thread.

Chieh-Hsin (Jesse) Lai@JCJesseLai

Tired of going back to the original papers again and again? Our monograph: a systematic and fundamental recipe you can rely on! 📘 We’re excited to release 《The Principles of Diffusion Models》, with @DrYangSong, @gimdong58085414, @mittu1204, and @StefanoErmon. It traces the core ideas that shaped diffusion modeling and explains how today’s models work, why they work, and where they’re heading. 🧵You’ll find the link and a few highlights in the thread. We’d love to hear your thoughts and join some discussions! ⚡ Stay tuned for our markdown version, where you can drop your comments!

267

26.8K

Yutong (Kelly) He@electronickale·21 Jan

Absolutely legendary book and thank you guys so much for all the efforts that you put into it! Teaching diffusion is significantly easier with this material at hand! I would recommend everyone who’s interested in this topic to check it out!

Chieh-Hsin (Jesse) Lai@JCJesseLai

4.3K

Yutong (Kelly) He@electronickale·19 Jan

The recording of our first lecture is up on YouTube now! youtube.com/watch?v=p7Q77S…. We shall hope to upload the recordings from the previous week every Sunday and Wednesday! Hope you guys will enjoy them! P.S. Our first homework is also up! Check it out on our website!

10K

Yutong (Kelly) He Retweeted

MolSS Reading Group@MolSS_Group·3 Jan

How to enable model density for few-step generative models? On this Tuesday (Jan 6th), 4pm-5pm UK time, we will have Xinyue Ai @Keely7ai and Kelly He @electronickale to talk about “Joint Distillation for Fast Likelihood Evaluation and Sampling in Flow-based Models” 🔥 links👇

2.1K

Yutong (Kelly) He@electronickale·4 Jan

I do feel the “keep building” part, but it’s more similar to the dopamine rush I get from doomscrolling memes than anything else tbh

684

Yutong (Kelly) He@electronickale·4 Jan

I’ve always seen twitter bros be like “keep building keep shipping” when using ai agents for coding, but now that I’ve tried it for real, it feels less like grinding and more adjacent to eating popcorn while bed rotting and watching heated rivalry (I’m not complaining)

5.1K

Yutong (Kelly) He Retweeted

Thomas Zhang@ThomasTCKZhang·17 Dec

🤖🤖Very excited to finally share our new work “Action Chunking and Exploratory Data Collection Yield Exponential Improvements in Behavior Cloning for Continuous Control” Everyone in robotics does action-chunking, but why does it actually work?🤔🤔And, what can theory tell us about the properties of data we should be collecting for robotic behavior cloning? 🧵1/N

403

59.3K

Yutong (Kelly) He Retweeted

Albert Gu@_albertgu·13 Dec

quite belated, but we finally uploaded "ARC-AGI Without Pretraining" to arXiv (link in reply). very impressive project by @LiaoIsaac91893 when he was just a first year PhD! he drove this entire project from beginning to end while I ate 🍿. at NeurIPS last week, Isaac was recognized with the ARC Prize 2025 Paper Award Runner Up for his innovative approach 🥳

ARC Prize@arcprize

ARC Prize 2025 Winners Interviews Paper Award 3rd Place @LiaoIsaac91893 shares the story behind CompressARC - an MDL-based, single puzzle-trained neural code golf system that achieves ~20–34% on ARC-AGI-1 and ~4% on ARC-AGI-2 without any pretraining or external data.

200

28.3K
