Yutong (Kelly) He

203 posts

@electronickale

PhD student @mldcmu

Pittsburgh, PA Joined March 2021
487 Following 1.8K Followers
Pinned Tweet
Yutong (Kelly) He
Yutong (Kelly) He@electronickale·
Diffusion planners are great for offline RL. But they need many steps to work well! Way too slow for real-time decision making! Presenting RACTD at #ICLR2026: reward-aware distillation that plans in ONE step 🇧🇷 Today (4/23) P4-#4618 3:15-5:45 PM arxiv.org/abs/2506.07822 1/
2
19
96
8.2K
Yutong (Kelly) He retweeted
Eungyeup Kim
Eungyeup Kim@EungyeupKim·
As LLMs saturate benchmarks, evaluating their five-nines reliability is crucial, but prohibitively expensive. We cut the inference cost by 5-20× on average (up to 156×) by exploiting a key insight: LLM failures are not random. 🧵[1/n]
2
12
73
7.1K
Yutong (Kelly) He retweeted
Jing Yu Koh
Jing Yu Koh@kohjingyu·
One of the things I’m most excited about this year is building agents that can work productively for hours, days, or weeks. Coding agents are starting to become very competent at this, but what about computer use agents?

Our new benchmark, Odysseys (co-led with @JangLawrenceK) is a set of 200 new tasks derived from real world browsing behavior that measure long horizon web navigation capabilities (potentially up to hours of web browsing work). Interestingly, we find that frontier CUAs are already surprisingly good at working productively for up to an hour on these tasks, but there’s a lot of work to be done in making them even more efficient.

Like every other AI researcher, my real dream is to open a cafe once we solve ASI. So, here’s Opus 4.6 doing some market research for me ("I want to do market research on the most popular cafes in Singapore. Analyse the menus of the top 10 cafes in Singapore (by Google reviews/ratings), and make sure we include at least 1 from the North/South/East/West/Central regions of Singapore. Keep the relevant pages of each cafe open, and summarise their pricing, menu offerings, unique selling points, making sure to reference which tab is opened for each cafe. For each cafe, also help me figure out how long it would take to get to it from Tampines MRT, and include this in your final summary.").

I was very impressed to see Opus 4.6 complete this task after working for 52 mins, satisfying all 7 rubrics that corresponded to this task. It provided a very nice markdown summary at the end that gave me all the information I asked for!
11
25
124
44.5K
Yutong (Kelly) He
Yutong (Kelly) He@electronickale·
RACTD improves over previous SOTA by 9.7% on D4RL Gym-MuJoCo and outperforms Diffuser on long-horizon Maze2D planning. All with a SINGLE denoising step, achieving up to 142x speedup over diffusion counterparts 🚀🚀🚀
1
1
3
464
Yutong (Kelly) He retweeted
Calvin Luo
Calvin Luo@calvinyluo·
How can visual planning agents 𝙨𝙚𝙡𝙛-𝙞𝙢𝙥𝙧𝙤𝙫𝙚 from their own collected experience? We present 𝗦𝗜𝗟𝗩𝗥🩶, a framework that combines offline data with online experience for concurrent zero-shot generalization and sample-efficient self-improvement capabilities! #ICLR2026
1
20
105
18.6K
Yutong (Kelly) He
Yutong (Kelly) He@electronickale·
F2D2 is accepted at #ICLR2026! To celebrate, we have added a new JAX codebase & new results w/ Lagrangian self-distillation in the camera-ready! Check them out on our project page: kellyyutonghe.github.io/f2d2/ P.S. I will present F2D2 Apr 23 10:30 AM – 1:00 PM P3-#1911, see y'all in Rio 🇧🇷
Yutong (Kelly) He@electronickale

Diffusion/Flow-based models can sample in 1-2 steps now 👍 But likelihood? Still requires 100-1000 NFEs (even for these fast models) 😭 We fix this! Introducing F2D2: simultaneous fast sampling AND fast likelihood via joint flow map distillation. arxiv.org/abs/2512.02636 1/🧵

1
10
106
11.3K
Yutong (Kelly) He
Yutong (Kelly) He@electronickale·
5 days into my trip to the Bay Area I’ve already upgraded my Claude subscription to max 🙂
3
0
31
2.9K
Yutong (Kelly) He retweeted
Peter Tong
Peter Tong@TongPetersb·
Train Beyond Language. We bet on the visual world as the critical next step alongside and beyond language modeling. So, we studied building foundation models from scratch with vision. We share our exploration: visual representations, data, world modeling, architecture, and scaling behavior! [1/9]
35
221
1.1K
216.7K
Yutong (Kelly) He retweeted
maxwell jones
maxwell jones@maxwell54650346·
Video Editing is great - but what if you want to apply an effect to your input video described by another video?? Introducing RefVFX, the first method that takes in both an input video and a reference effect video for generative video editing!
6
23
116
21.3K