Jaden Clark

48 posts

@jadenvclark

PhD @Stanford @KnightHennessy. AI, robotics, and conservation

Los Angeles, CA · Joined December 2019
343 Following · 223 Followers
Pinned Tweet
Jaden Clark @jadenvclark
How can we leverage human video data to train generalist robot policies? 🤖 Enter RAD: Reasoning through Action-Free Data, a new way to train robot policies using both robot and human video data via action reasoning. rad-generalization.github.io
4
27
102
27.5K
Jaden Clark retweeted
Stanford MSL @StanfordMSL
π, But Make It Fly ✈️ We fine-tuned π0, a VLA model pretrained entirely on manipulators, to fly a drone that picks up objects, navigates through gates, and composes both skills from language commands.
14
44
361
98.6K
Jaden Clark retweeted
Xiaomeng Xu @XiaomengXu11
Can we learn whole-body mobile manipulation directly from human demonstrations? Introducing Whole-Body Mobile Manipulation Interface (HoMMI) Egocentric + UMI, 0 teleop -> bimanual & whole-body manipulation, long-horizon navigation, active perception hommi-robot.github.io
11
71
322
63.4K
Jaden Clark retweeted
Zhanyi Sun @s_zhanyi
We find that RL post-training can substantially improve BC policies without teaching them anything fundamentally new. So what is RL doing? In DICE-RL, it contracts a broad behavior prior toward high-value modes. (1/n) zhanyisun.github.io/dice.rl.2026/
6
42
268
26.4K
Jaden Clark retweeted
Zeyi Liu @Liu_Zeyi_
For video generation in robotic applications, looking pretty is usually not enough. Robot manipulation requires understanding how visual observations and 3D geometry evolve over time under agent actions, with temporal coherence and geometric consistency across camera views. We study this challenge in our work (recently accepted by @iclr_conf), 4D Video Generation for Robot Manipulation, which enforces multi-view 3D consistency via geometric supervision to generate spatio-temporally aligned videos.
9
39
310
53.3K
Jaden Clark retweeted
Moo Jin Kim @moo_jin_kim
We release Cosmos Policy 💫: a state-of-the-art robot policy built on a video diffusion model backbone. - policy + world model + value function — in 1 model - no architectural changes to the base video model - SOTA in LIBERO (98.5%), RoboCasa (67.1%), & ALOHA tasks (93.6%) 🧵👇
18
110
865
148.1K
Jaden Clark @jadenvclark
A key aspect of scaling robot data collection is sensor reliability (hence why we haven't really seen tactile sensing at scale yet). UMI-FT addresses this by giving UMI robust/reliable fingertip level force-torque sensing, a major step toward scaling data for contact-rich tasks.
Hojung Choi @Hojung_Choi_

Robots excel at learning motions from humans, but can they also learn to apply force safely? 💪 Introducing UMI-FT: the UMI gripper equipped with force/torque sensors (CoinFT) on each finger. Multimodal data from UMI-FT, combined with diffusion policy and compliance control, enables robots to apply sufficient yet safe force for task completion. UMI-FT Project website: umi-ft.github.io CoinFT Project website: coin-ft.github.io

0
0
2
186
Jaden Clark retweeted
Toby Mao @tobymm25
In Silicon Valley, “virtual cells” are suddenly everywhere. Meta and CZI recently went all in, signaling that this is no longer a fringe research direction. So what is a virtual cell? Check out my new blog post about virtual cells: yuncongtobymao.com/blog/virtual-c…
1
2
4
253
Jaden Clark @jadenvclark
Simple heuristics can make a big difference when using foundation models for science. See our new paper on tracking animals with SAM 2, and check out some of the conservation work we're doing with it down in Costa Rica: woods.stanford.edu/news/pixels-pr…
Methods in Ecology and Evolution @MethodsEcolEvol

📖Published📖 Lalgudi et al. introduce Frame-Level Alignment and Tracking (FLAIR). FLAIR takes a drone video as input and outputs segmentation masks of the species of interest across the video 🦈 buff.ly/l7GA2ZR

0
0
4
422
Jaden Clark retweeted
Maximilian Du @du_maximilian
Normally, changing robot policy behavior means changing its weights or relying on a goal-conditioned policy. What if there was another way? Check out DynaGuide, a novel policy steering approach that works on any pretrained diffusion policy. dynaguide.github.io 🧵
5
30
144
18.6K
Jaden Clark retweeted
Siddharth Karamcheti @siddkaramcheti
Thrilled to share that I'll be starting as an Assistant Professor at Georgia Tech (@ICatGT / @GTrobotics / @mlatgt) in Fall 2026. My lab will tackle problems in robot learning, multimodal ML, and interaction. I'm recruiting PhD students this next cycle – please apply/reach out!
72
23
564
61.2K
Jaden Clark retweeted
Dorsa Sadigh @DorsaSadigh
Here is another uncut video of real-time interactions with @GoogleDeepMind's Gemini Robotics!
47
171
1.6K
204.4K
Jaden Clark @jadenvclark
(5/6) We also conducted a user study where users specified their own arbitrary behaviors and ranked a set of trajectories. Users preferred LGPL 77% of the time!
1
0
0
283
Jaden Clark @jadenvclark
What if I want my robot dog to act excited to see me? 🤖🐶 We introduce Language-Guided Preference Learning (LGPL), combining the efficiency of LLM parameterization with the precision of preference learning to generate 𝐞𝐱𝐩𝐫𝐞𝐬𝐬𝐢𝐯𝐞 behaviors. lgpl-gaits.github.io
2
2
40
5.9K
Jaden Clark retweeted
Joey Hejna @JoeyHejna
Behavior cloning... clones behaviors, so naturally data quality directly affects performance. However, there aren't great ways of measuring how "good" or "bad" different demonstrations are. Our recent work seeks to address this problem using estimators of mutual information... 🧵
2
30
235
34.1K