Yu-Cheng Chou
@johnson111788

PhD student at @CCVLatJHU @JHU. Research Intern at @NVIDIA. Working on Embodied AI, MLLM, Video Gen.

CCVL@Johns Hopkins University · Joined May 2019
171 Following · 111 Followers
Jack AM Austin@JackAMAustin·
@johnson111788 It's genuinely wild that flying through a consistent generated world is possible right now
Yu-Cheng Chou@johnson111788·
CVPR 2026🎥We built a model that lets you fly through a generated video world. Not just generating frames — but maintaining a consistent 3D world under complex camera motion. Code, ckpt, and even the data pipeline are all open-sourced ↓ #AI #worldmodel #videogen #cvpr #drone
merve@mervenoyann·
@johnson111788 Hello! I saw you're open sourcing your weights, if you want to build a demo we'd love to provide ZeroGPU H200 for it 🤗
puppy@carbon787777·
@johnson111788 nice view, congrats on the CVPR acceptance.
WildPinesAI@wildpinesai·
@johnson111788 @Scobleizer basically a game engine that dreams instead of renders. this is how you get embodied AI training environments without hand-building every world
Yu-Cheng Chou@johnson111788·
Importantly, we open-source the entire curation pipeline:
• trajectory reconstruction (SfM / hloc)
• geometric verification
• motion consistency checks
• automatic repair & filtering
→ You can build your own OpenSafari.
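The motion-consistency stage of a pipeline like this can be illustrated with a small sketch. This is a hypothetical, simplified check (the function names and the `max_jump_ratio` threshold are assumptions, not the released code): given per-frame camera positions recovered by SfM, it rejects trajectories whose frame-to-frame jumps are implausibly large relative to the typical step, a crude detector for SfM tracking failures.

```python
# Hedged sketch of a motion-consistency filter over SfM camera
# trajectories. All names/thresholds are illustrative assumptions.
import math


def frame_steps(positions):
    """Euclidean distance between consecutive camera positions."""
    return [math.dist(a, b) for a, b in zip(positions, positions[1:])]


def is_motion_consistent(positions, max_jump_ratio=5.0):
    """Flag a trajectory as inconsistent if any single step is far
    larger than the median step (a crude SfM-failure detector)."""
    steps = frame_steps(positions)
    if not steps:
        return True
    median = sorted(steps)[len(steps) // 2]
    if median == 0.0:
        return all(s == 0.0 for s in steps)
    return max(steps) <= max_jump_ratio * median


def filter_trajectories(trajs, max_jump_ratio=5.0):
    """Filtering stage: keep only trajectories that pass the check."""
    return [t for t in trajs if is_motion_consistent(t, max_jump_ratio)]
```

A real pipeline would also verify geometry (e.g., reprojection error) and attempt repair before dropping a clip; this sketch only shows the filtering idea.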
Yu-Cheng Chou@johnson111788·
We also built OpenSafari, a dataset designed to break existing models:
• in-the-wild FPV drone videos
• large-scale camera motion
• strong parallax & elevation changes
Every trajectory is geometrically verified.
Yu-Cheng Chou@johnson111788·
The difference is most visible under aggressive motion:
• sharp turns
• large parallax
• long trajectories
Baselines collapse; ours stays consistent.
Yu-Cheng Chou@johnson111788·
Our idea: build a world memory.
At each frame, the model retrieves 3D-consistent information conditioned on the camera pose.
→ This enables stable generation under 6-DoF motion
→ Even for long trajectories in complex outdoor scenes
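The retrieval idea can be sketched in a few lines. This is a toy illustration of pose-conditioned memory lookup, not the paper's architecture: the `WorldMemory` class and its nearest-pose retrieval are assumptions made for exposition. Each entry pairs a camera position with features observed there; generation at a new pose retrieves the k nearest stored entries so the conditioning context is spatially consistent.

```python
# Minimal, hypothetical sketch of a pose-conditioned "world memory".
# Class name, entry format, and distance metric are illustrative only.
import math


class WorldMemory:
    def __init__(self):
        self.entries = []  # list of (position, features) pairs

    def write(self, position, features):
        """Store features observed at a given camera position."""
        self.entries.append((position, features))

    def retrieve(self, position, k=3):
        """Return features of the k entries closest to the query pose,
        so generation is conditioned on 3D-consistent context."""
        ranked = sorted(self.entries, key=lambda e: math.dist(e[0], position))
        return [feats for _, feats in ranked[:k]]
```

A real system would use full 6-DoF poses (rotation as well as translation) and learned features rather than raw nearest-neighbor lookup, but the conditioning pattern is the same.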
Yu-Cheng Chou@johnson111788·
But in reality, most video generation models today:
❌ Only work on narrow domains (e.g., real estate scenes)
❌ Break as soon as the camera moves
❌ Fail to follow the camera trajectory
The core issue? → No persistent world representation
Yu-Cheng Chou@johnson111788·
@CVPR Can we put the title in the main text to save space?
Yu-Cheng Chou retweeted
Wufei Ma@wufeima·
Join us at #ICCV2025 for the 1st Embodied Spatial Reasoning Workshop! We're thrilled to host amazing speakers from industry and academia, featuring Sifei Liu, @xiaolonw, @xf1280, and @kate_saenko_, to discuss frontiers of spatial reasoning, embodied agents, and robotics! 🔗 tinyurl.com/yn7b6mu6
Yu-Cheng Chou@johnson111788·
Deep gratitude to our advisors @cihangxie and @never1andd for their guidance and support throughout this work. Thank you! 🙏