Zhenyang Chen

32 posts

@DanielZhenyang

Robotics Ph.D at Georgia Tech

Joined December 2021
430 Following · 114 Followers
Zhenyang Chen retweeted
Tairan He@TairanHe99·
GR00T-VisualSim2Real is now open source! VIRAL and DoorMan are now available with training code, simulation assets, and the full recipe for bringing visual sim-to-real loco-manipulation skills to your own humanoids. Repo: github.com/NVlabs/GR00T-V…
Tairan He@TairanHe99

Zero teleoperation. Zero real-world data. ➔ Autonomous humanoid loco-manipulation in reality.

Introducing VIRAL: Visual Sim-to-Real at Scale. We achieved 54 autonomous cycles (walk, stand, place, pick, turn) using a simple recipe:
1. RL
2. Simulation
3. GPUs

Website: viral-humanoid.github.io
Arxiv: arxiv.org/abs/2511.15200
Deep dive with me: 🧵

Zhenyang Chen@DanielZhenyang·
when you ask Codex to optimize docs in a codebase, this is what happens: it first happily deleted the old CLAUDE.md and wrote this 😂😭 #claude #codex #ai #code #robotics
[attached image]
Zhenyang Chen retweeted
Saining Xie@sainingxie·
vision🍌 is here vision-banana.github.io if you got into computer vision the way I did, starting with pixel-level labeling tasks like segmentation, edges, depth, or surface normals, you’ll probably feel the same seeing these results -- something big has quietly shifted, and it’s going to change how we approach these problems for good 🧵
Zhenyang Chen retweeted
Jeff Bezos@JeffBezos·
ZXX
Zhenyang Chen retweeted
Younghyo Park@younghyo_park·
What's different between these two BC policies? It's the same architecture, training budget, and data collection setup — the only difference is the controller gains! Controller gains are an understudied design parameter in robot learning. In our new work (w/ @BronarsToni*, @pulkitology), we show how they act as an inductive bias across BC, RL, and Sim2Real transfer, with real consequences on performance. Here's what we found 🧵
* Equal Contribution
📄arxiv: arxiv.org/abs/2604.02523
🔗website: younghyopark.me/tune-to-learn/
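The idea that gains shape how commands map to motion can be seen with a textbook PD position controller on a toy unit point mass. This is a generic sketch, not the paper's setup; `kp`, `kd`, and the toy dynamics are my own assumptions:

```python
def simulate_pd(kp: float, kd: float, q_target: float = 1.0,
                steps: int = 2000, dt: float = 0.001) -> float:
    """Drive a unit point mass toward q_target with a PD law; return final |error|."""
    q, qd = 0.0, 0.0
    for _ in range(steps):
        tau = kp * (q_target - q) - kd * qd  # PD control torque
        qd += tau * dt                        # unit mass: qdd = tau (Euler step)
        q += qd * dt
    return abs(q_target - q)

# Stiff gains track the commanded target tightly; soft gains leave a larger
# residual error — the same action command produces a different motion.
stiff_err = simulate_pd(kp=400.0, kd=40.0)
soft_err = simulate_pd(kp=20.0, kd=4.0)
```

A policy trained under one gain setting therefore implicitly learns the command-to-motion mapping that setting induces, which is the sense in which gains act as an inductive bias.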
Zhenyang Chen retweeted
Patrick OShaughnessy@patrick_oshag·
My conversation with Sergey Levine (@svlevine). Sergey is the co-founder of @physical_int -- a company building foundation models that can control any robot to do any task in any environment. The company's thesis is that generality is more scalable than specialization, meaning that a model trained across many different robots and tasks will ultimately outperform any system built to do one thing well (eg, just wash dishes).

Sergey is a researcher by background, but I think you will appreciate how practical and commercially grounded this conversation is. We discuss:
- Why changing a diaper will be the last task a robot masters
- The simulation v. real-world data debate
- How multimodal LLMs give robots common sense
- Moravec's Paradox + Robot Olympics
- Why robots can do long-horizon tasks now
- A realistic timeline for robots in our homes

I should note that I am an investor in Physical Intelligence -- I made the investment because I believe it is one of the most important companies tackling the problem of robotics. Enjoy!

Timestamps:
0:00 Intro
2:39 Defining Physical Intelligence
5:19 The Challenge of Building General Models
6:34 The Stakes and Future of General Purpose Robotics
8:15 Pros and Cons of Humanoid Robots
10:12 Historical Milestones in Robotics Research
15:31 Combining Generative AI and Deep RL
21:24 Moravec's Paradox
25:33 Kitchen Robots
29:30 Simulation vs. Real-World Data
30:48 The Robot Olympics
36:31 The Physiological Reality of Embodiment
38:56 Controversies in the Robotics Community
44:18 What Makes a Great Researcher
48:27 How Businesses Should Prepare for Robotics
54:09 Tracking Progress Through Research Papers
57:02 The Next Step: Mid-Level Reasoning
1:02:00 The Kindest Thing
Zhenyang Chen retweeted
Abhishek Gupta@abhishekunique7·
Excited to share the project that has surprised me the most in the last year! Large-scale RL in simulation, no demos and no reward engineering can solve dynamic, dexterous and contact rich tasks. The learned behaviors are reactive, forceful and use the environment for recovery in ways that are extremely challenging to bake in or teleoperate! You can play with the policies yourself to see: weirdlabuw.github.io/omnireset/ And, the learned behavior transfers to real world robots from RGB camera inputs! So what’s the trick - using simulator resets carefully! Let’s unpack (1/10)
Zhenyang Chen retweeted
Patrick Yin@patrickhyin·
We’re releasing OmniReset, a framework for training robot policies using large-scale RL and diverse resets for contact-rich, dexterous manipulation. OmniReset pushes the frontier of robustness and dexterity, without any reward engineering or demonstrations. Try the policies yourself in our interactive simulator! weirdlabuw.github.io/omnireset/ (1/N 🧵)
Zhenyang Chen retweeted
Jacob Zietek@JacobZietek·
Robotics has spent decades optimizing for research. Deployment requires a completely different kind of person: operators, industrialists, and outsiders the field typically ignores. There's a wave of people who want to build in robotics. The field doesn't know what to do with them. New essay, Robotics Needs Fewer Roboticists* below 👇
Zhenyang Chen retweeted
Lucas Maes@lucasmaes_·
JEPAs are finally easy to train end-to-end without any tricks! Excited to introduce LeWorldModel: a stable, end-to-end JEPA that learns world models directly from pixels, no heuristics. 15M params, 1 GPU, and full planning in <1 second. 📑: le-wm.github.io
Zhenyang Chen@DanielZhenyang·
Like this work a lot. Whole-body mobile manipulation is hard. The demos HoMMI is showing and the design choices they made are interesting. We are also pushing in this direction. Stay tuned
Xiaomeng Xu@XiaomengXu11

Can we learn whole-body mobile manipulation directly from human demonstrations? Introducing Whole-Body Mobile Manipulation Interface (HoMMI) Egocentric + UMI, 0 teleop -> bimanual & whole-body manipulation, long-horizon navigation, active perception hommi-robot.github.io

Zhenyang Chen@DanielZhenyang·
This is how we should evaluate robotics system papers: not as isolated components, but as integrated systems. What matters is how components interact—and how those interactions unlock new capabilities.
Guanya Shi@GuanyaShi

I’m so tired of writing rebuttals to this kind of “lack of novelty” review: “This paper trivially combines A, B, and C, so the algorithmic novelty is limited.” Technically, most (if not all) robotics papers are convex combinations of existing ideas. I still deeply appreciate A+B+C papers—especially when they deliver:
- New capabilities: the “trivial combination” unlocks behaviors we simply couldn’t achieve before
- Sensible & organic design: A+B+C is clearly the right composition—not some arbitrary A′+B+C′
- Nontrivial interactions: careful analysis of the dynamics, coupling, or failure modes between A, B, C
- Rehabilitating old ideas: A was dismissed for years, but paired with modern B/C, it suddenly works—and teaches us why
- System-level & "interface" insight: the contribution is not any single piece, but how the pieces talk to each other
- Scaling laws or regimes: identifying when/why A+B+C works (and when it doesn’t)
- Engineering clarity: making something actually work robustly in the real world is not “trivial”
- New problem formulations: sometimes the real novelty is in the reformulation—only under this view does A+B+C make sense.
Maybe worth keeping these in mind when reviewing the next A+B+C paper : )

Zhenyang Chen retweeted
Danfei Xu@danfei_xu·
Introducing EgoVerse: an ecosystem for robot learning from egocentric human data. Built and tested by 4 research labs + 3 industry partners, EgoVerse enables both science and scaling 1300+ hrs, 240 scenes, 2000+ tasks, and growing Dataset design, findings, and ecosystem 🧵
Zhenyang Chen retweeted
Yuke Zhu@yukez·
Today, we publicly released RoboCasa365, a large-scale simulation benchmark for training and systematically evaluating generalist robot models. Built upon our original RoboCasa framework, it offers:
• 2,500 realistic kitchen environments;
• 365 everyday tasks (basic skills + long-horizon mobile manipulation);
• Over 3,200 objects with many articulated fixtures/appliances.
All are designed for fully controlled, reproducible benchmarking of robotic policies.

Progress in robotic foundation models is real. But it’s still hard to answer basic questions like: How close are we to general-purpose autonomy? What factors drive generalization? What are the model/data scaling curves like? Real-world eval is slow and noisy, and existing sims (like LIBERO, which we built 3 years ago) often lack sufficient task and scene diversity.

This benchmark comes with 2,200+ hours of demonstrations and 500K+ trajectories to support studies of multi-task training, pretraining, and continual learning at scale. Check it out at robocasa.ai
Zhenyang Chen@DanielZhenyang·
With all the effort the community has put into humanoid robot hardware and learning, we will see a smaller and smaller embodiment gap between humans and robots. And human data scaling will shine ✨
Ruijie Zheng@ruijie_zheng12

Proud to introduce EgoScale: We pretrained a GR00T VLA model on 20K+ hours of egocentric human video and discovered that robot dexterity can be scaled, not with more robots, but with more human data. A thread on 🧵what we learned. 👇

Zhenyang Chen retweeted
Danfei Xu@danfei_xu·
1/ Ever wonder why many VLA demos look smooth only after 5-10× video speed-up? Running VLAs in real time (or faster) is not one problem. It’s several tightly coupled ones. Here is a "mini literature survey" thread on recent papers (RTC, SAIL, VLASH) in this paradigm.
[attached image]
Zhenyang Chen@DanielZhenyang·
@IliaLarchenko very cool work and congrats on the 1st place. You mentioned that training a smaller model from scratch failed; have you tried initializing pi05 with random weights? And what other smaller models have you tried?
Ilia@IliaLarchenko·
- BEHAVIOR has a fixed set of 50 tasks. We do not need to generalize to new text prompts, so we removed text entirely and replaced it with 50 trainable task embeddings (one per task).
- The training dataset contained multiple modalities (RGB, depth, segmentation) as well as extra subtask annotations, but we stuck to the simple approach: RGB images + robot state only.
- We predict 30-step action chunks (1s) and use delta actions with per-timestamp normalization.
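As a rough illustration of the last point, delta actions with per-timestep normalization can be sketched as below. The function names, shapes, and toy data are my own assumptions, not the released code; the point is only that normalization statistics are computed separately for each chunk index, since step-1 deltas tend to be much smaller than step-30 deltas:

```python
import numpy as np

def to_delta_chunk(states, actions):
    """Convert absolute action targets to deltas relative to the current state.

    states:  (B, D)    current robot state per sample
    actions: (B, T, D) absolute action chunk over T future steps
    returns: (B, T, D) delta actions
    """
    return actions - states[:, None, :]

def per_timestep_stats(deltas):
    """Mean/std computed per chunk index t (axis 0 is the batch)."""
    mean = deltas.mean(axis=0)       # (T, D)
    std = deltas.std(axis=0) + 1e-8  # (T, D), small offset avoids divide-by-zero
    return mean, std

def normalize(deltas, mean, std):
    return (deltas - mean) / std

# Toy data: 30-step chunks (1 s at 30 Hz), 7-D actions, batch of 64.
rng = np.random.default_rng(0)
states = rng.normal(size=(64, 7))
actions = states[:, None, :] + rng.normal(size=(64, 30, 7)).cumsum(axis=1) * 0.01
deltas = to_delta_chunk(states, actions)
mean, std = per_timestep_stats(deltas)
norm = normalize(deltas, mean, std)
```

After this, every chunk index has roughly zero-mean, unit-variance targets, so early and late steps contribute comparably to the training loss.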
Ilia@IliaLarchenko·
A couple of days ago, I presented our 1st place solution for the 2025 BEHAVIOR Challenge at @NeurIPSConf . Now, we've open-sourced our solution: code, model weights, and a detailed tech report. Let me unpack what we did 👇