Huaijin Pi

14 posts

@HuaijinPi

Ph.D. student at the University of Hong Kong

Joined March 2022
101 Following · 66 Followers
Huaijin Pi reposted
Wenjia Wang @WenjiaWang_HKU
🚀 Excited to share our #CVPR2026 paper, EmbodMocap: In-the-Wild 4D Human-Scene Reconstruction for Embodied Agents.

EmbodMocap is a portable and affordable solution that requires only two moving iPhones: no calibrated multi-view camera studio, motion capture suits, or LiDAR sensors needed. With our fully automated optimization pipeline, you can effortlessly obtain high-precision scene meshes, human interaction motions, RGBD images, and camera parameters. The captured data is ready for training human-scene reconstruction models (like TRAM, pi3, etc.) and humanoid control policies (like DeepMimic, AMP, etc.).

What you need to do:
1. Borrow or buy two iPhone 12 Pros (600 USD in total on eBay).
2. Find 2 friends, then capture the sequences.
3. Deploy our repo, run our code, and get the results!

The code and data will be released within 1 week. (Just came back to work after the Chinese Spring Festival. Happy Chinese New Year!)

📷 Project page: wenjiawang0312.github.io/projects/embod…
📷 ArXiv: arxiv.org/abs/2602.23205
📷 Code: github.com/WenjiaWang0312…
Huaijin Pi reposted
Zhiyang (Frank) Dou @frankzydou
Please check out our paper #MOSPA, “🎧 Human Motion Generation Driven by Spatial Audio”, at #NeurIPS2025 (🌟 Spotlight)! 😊 We have released our dataset and models :)

💡 The paper tackles the challenge of spatial-audio-driven human motion generation, enabling virtual humans to respond dynamically and realistically to diverse spatial sounds: not just “what” is sounding, but also “where” and “how” it sounds in space.

💡 We introduce SAM, the first comprehensive Spatial Audio-Driven Human Motion dataset, with diverse spatial audio scenarios and high-quality 3D motion pairs, providing a solid benchmark for studying human motion conditioned on spatial audio.

💡 Building on this, MOSPA is a diffusion-based generative framework that fuses semantic and spatial features of the audio to synthesize diverse, realistic motions aligned with spatial audio cues, achieving state-of-the-art performance on this new task and offering a strong baseline for future research.

If you work on virtual humans, spatial audio, XR, or humanoid / embodied control, this can be a good source for learning motion skills.

Please come meet the team at our #NeurIPS2025 San Diego Spotlight poster!
📍 Exhibit Hall C,D,E, #4310
🕚 Fri, Dec 5 | 11 a.m.–2 p.m. PST

Homepage: frank-zy-dou.github.io/projects/MOSPA…
Paper: arxiv.org/abs/2507.11949
Code and Data: github.com/xsy27/Mospa-Ac…

#NeurIPS #NeurIPS2025 #MOSPA #motion #Animation #SpatialAudio #VirtualHuman #Robotics #Robot #AI #Deeplearning #GenerativeAI #AIGC
Zhiyang (Frank) Dou @frankzydou

Excited to share our latest work on 🎧spatial audio-driven human motion generation. We aim to tackle a largely underexplored yet important problem of enabling virtual humans to move naturally in response to spatial audio—capturing not just what is heard, but also where the sound is coming from. To this end, we introduce the Spatial Audio-Driven Human Motion (SAM) dataset—the first comprehensive dataset featuring paired high-quality human motion and spatial audio recordings. For benchmarking, we develop a generative framework for human MOtion generation driven by SPAtial audio, termed MOSPA, which learns to synthesize realistic and diverse human motions conditioned on spatial audio input. We hope this research could provide a foundation for future research in spatial perception, virtual characters, and embodied AI. The dataset and model will be open-sourced soon. A big thank you to our intern, Shuyang Xu, for the wonderful collaboration! Congratulations, Shuyang! Project page: frank-zy-dou.github.io/projects/MOSPA… Paper: arxiv.org/abs/2507.11949 Video: youtu.be/p_xwTDA-K0g #Animation #CG #CV #AIGC #DL #Deeplearning #Motion #Graphics #AI #GenerativeAI

Huaijin Pi @HuaijinPi
Come meet us at our San Diego poster! 🎉
📍 Exhibit Hall C,D,E, #5207
🕚 Wed, Dec 3 | 11 a.m.–2 p.m. PST
Huge thanks to my amazing collaborators: Zhi Cen, @frankzydou, and Taku Komura
Huaijin Pi @HuaijinPi
CoDA is not limited to articulated objects — it also supports rigid object manipulation, producing stable, coordinated whole-body motions driven purely by text.
Zhen Xu @realzhenxu
Introducing the 4D Gaussian Transformer, a feed-forward dynamic Gaussian reconstructor trained only on monocular videos, achieving performance on par with optimization-based methods at a fraction of the inference time! Paper: arxiv.org/abs/2506.08015 Page: 4dgt.github.io
Zhen Xu @realzhenxu
Tired of short multi-view video datasets? Check out our new SelfCap dataset, with up to 10 minutes of high-quality, dense-view, 24-camera recordings at 4K resolution! Data released for our Long Volumetric Video paper: zju3dv.github.io/longvolcap Dataset link: forms.gle/MzJqZjBfyZ53fR…