
Haoran Geng
81 posts

Haoran Geng
@HaoranGeng2
CS PhD at @Berkeley_AI. Prev: @Stanford, @PKU1898. Robotics, RL, 3D Vision



This is the most dexterous task I’ve seen a humanoid do so far. Fully autonomous, powered by Sharpa’s CraftNet (VTLA), using tactile feedback to continuously fine-tune the last-millimeter interaction.
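
A minimal sketch of what "tactile feedback fine-tuning the last-millimeter interaction" can mean in practice, assuming a simple force-servo formulation; the sensor and control calls (read_tactile, send_ee_offset) and all gains are hypothetical placeholders, not Sharpa's CraftNet API.

```python
# Hypothetical sketch: closed-loop millimeter-scale correction from tactile readings.
import numpy as np
import time

KP = 0.0005          # meters per unit force error: keeps each step millimeter-scale
TARGET_FORCE = 1.5   # assumed desired mean fingertip normal force (N)
MAX_STEP = 0.001     # clamp every correction to 1 mm

def read_tactile():
    """Placeholder: return per-fingertip normal forces from a tactile sensor."""
    return np.random.uniform(0.0, 3.0, size=5)

def send_ee_offset(delta):
    """Placeholder: command a small end-effector position offset (meters)."""
    print(f"offset: {np.round(delta, 4)} m")

for _ in range(100):                      # run the correction loop at ~50 Hz
    forces = read_tactile()
    error = TARGET_FORCE - forces.mean()  # too light -> press in, too hard -> back off
    step = np.clip(KP * error, -MAX_STEP, MAX_STEP)
    send_ee_offset(np.array([0.0, 0.0, step]))  # adjust along the approach axis only
    time.sleep(0.02)
```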






Introducing Large Video Planner (LVP-14B), a robot foundation model that actually generalizes. LVP is built on video generation, not VLA. It is my final work at @MIT; all of its eval tasks were proposed by third parties as a maximum stress test, and it excels!🤗 boyuan.space/large-video-pl…


We are thrilled to share the appointment of @QianqianWang5 as a #KempnerInstitute Investigator! She will bring her expertise in computer vision to @Harvard. Read the announcement: bit.ly/4mIghHy @hseas #AI #ComputerVision


🤖 What if a humanoid robot could make a hamburger from raw ingredients, all the way to your plate? 🔥 Excited to announce ViTacFormer: our new pipeline for next-level dexterous manipulation with active vision + high-resolution touch. 🎯 For the first time ever, we demonstrate ~2.5 minutes of continuous, autonomous control, combining active vision, high-res touch, and the high-DoF SharpaWave robot hand, to complete complex, real-world tasks. Code is fully released; check it out:
Homepage: roboverseorg.github.io/ViTacFormerPag…
Paper: arxiv.org/abs/2506.15953
GitHub: github.com/RoboVerseOrg/V…

@OfficialLoganK UC Berkeley researchers introduced ViTacFormer, a unified visuo-tactile pipeline for robot manipulation. It fuses high-resolution visual and tactile data using cross-attention and enables multi-fingered hands to perform precise, long-horizon tasks.
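
A minimal sketch of what cross-attention fusion of visual and tactile tokens can look like; the token counts, embedding dimension, pooling, and action head below are illustrative assumptions, not the released ViTacFormer architecture.

```python
# Sketch: visual tokens attend to tactile tokens, fused features feed a policy head.
import torch
import torch.nn as nn

class VisuoTactileFusion(nn.Module):
    def __init__(self, dim=256, heads=8, action_dim=22):  # action_dim: assumed hand DoF
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.policy = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                    nn.Linear(dim, action_dim))

    def forward(self, vis_tokens, tac_tokens):
        # Visual tokens are the queries; tactile tokens are keys/values, so the
        # fused representation carries contact information into the policy.
        fused, _ = self.cross_attn(vis_tokens, tac_tokens, tac_tokens)
        fused = self.norm(vis_tokens + fused)
        return self.policy(fused.mean(dim=1))  # pool tokens -> one action per sample

vis = torch.randn(2, 196, 256)   # e.g. ViT patch tokens from a wrist camera
tac = torch.randn(2, 64, 256)    # e.g. embedded tactile taxel patches
print(VisuoTactileFusion()(vis, tac).shape)  # torch.Size([2, 22])
```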





