Huaijin Pi

14 posts

Huaijin Pi

@HuaijinPi

Ph.D. student at the University of Hong Kong

Joined March 2022

101 Following · 66 Followers

Pinned Tweet

Huaijin Pi @HuaijinPi · Nov 30

🚀 Excited to share our NeurIPS 2025 paper: CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects 🔗 Project page: phj128.github.io/page/CoDA/inde… 🔗 Code: github.com/phj128/CoDA 🔗 Paper: arxiv.org/abs/2505.21437

19.9K

Huaijin Pi retweeted

Wenjia Wang @WenjiaWang_HKU · Feb 27

🚀 Excited to share our #CVPR2026 paper: EmbodMocap: In-the-Wild 4D Human-Scene Reconstruction for Embodied Agents. EmbodMocap is a portable yet affordable solution requiring only two moving iPhones, with no calibrated multi-view camera studio, motion capture suits, or LiDAR sensors needed. With our fully automated optimization pipeline, you can effortlessly obtain high-precision scene meshes, human interaction motions, RGBD images, and camera parameters. The captured data is ready for training human-scene reconstruction models (like TRAM, pi3, etc.) and humanoid control policies (like deepmimic, AMP, etc.). What you need to do: 1. Borrow or buy two iPhone 12 Pros from eBay (600 USD in total). 2. Find 2 friends, then capture the sequences. 3. Deploy our repo, run our code, and get the results! The code and data will be released within 1 week. (Just came back to work from the Chinese Spring Festival. Happy Chinese New Year!) 📷 Project page: wenjiawang0312.github.io/projects/embod… 📷 ArXiv: arxiv.org/abs/2602.23205 📷 Code: github.com/WenjiaWang0312…

284

16.7K

Huaijin Pi retweeted

Qing Shuai @chingswy · Dec 31

Work hard for this project!

Tencent Hy @TencentHunyuan

✨We are excited to open-source Tencent HY-Motion 1.0, a billion-parameter text-to-motion model built on the Diffusion Transformer (DiT) architecture and flow matching. Tencent HY-Motion 1.0 empowers developers and individual creators alike by transforming natural language into high-fidelity, fluid, and diverse 3D character animations, delivering exceptional instruction-following capabilities across a broad range of categories. The generated 3D animation assets can be seamlessly integrated into typical 3D animation pipelines.🎮🎥 Highlights: 🔹Billion-Scale DiT: Successfully scaled flow-matching DiT to 1B+ parameters, setting a new ceiling for instruction-following capability and generated motion quality. 🔹Full-Stage Training Strategy: The industry’s first motion generation model featuring a complete Pre-training → SFT → RL loop to optimize physical plausibility and semantic accuracy. 🔹Comprehensive Category Coverage: Features 200+ motion categories across 6 major classes—the most comprehensive in the industry, curated via a meticulous data pipeline. 🌐Project Page: hunyuan.tencent.com/motion 🔗Github: github.com/Tencent-Hunyua… 🤗Hugging Face: huggingface.co/tencent/HY-Mot… 📄Technical report: arxiv.org/pdf/2512.23464

Huaijin Pi retweeted

Zhiyang (Frank) Dou @frankzydou · Dec 3

Please check out paper #MOSPA "🎧Human Motion Generation Driven by Spatial Audio” at #NeurIPS2025 (🌟Spotlight)! 😊We have released our dataset and models : ) 💡The paper tackles the challenge of spatial-audio-driven human motion generation, enabling virtual humans to respond dynamically and realistically to diverse spatial sounds — not just “what” is sounding, but also “where” and “how” it sounds in space. 💡We introduce SAM, the first comprehensive Spatial Audio-Driven Human Motion dataset, with diverse spatial audio scenarios and high-quality 3D motion pairs, providing a solid benchmark for studying human motion conditioned on spatial audio. 💡Building on this, MOSPA is a diffusion-based generative framework that fuses semantic and spatial features of the audio to synthesize diverse, realistic motions aligned with spatial audio cues, achieving state-of-the-art performance on this new task and offering a strong baseline for future research. If you work on virtual humans, spatial audio, XR, or humanoid / embodied control, this can be a good motion skill learning source. Please come meet the team at our #NeurIPS2025 San Diego Spotlight poster! 📍 Exhibit Hall C,D,E — #4310 🕚 Fri, Dec 5 | 11 a.m.–2 p.m. PST Homepage: frank-zy-dou.github.io/projects/MOSPA… Paper: arxiv.org/abs/2507.11949 Code and Data: github.com/xsy27/Mospa-Ac… #NeurIPS #NeurIPS2025 #MOSPA #motion #Animation #SpatialAudio #VirtualHuman #Robotics #Robot #AI #Deeplearning #GenerativeAI #AIGC

Zhiyang (Frank) Dou @frankzydou

Excited to share our latest work on 🎧spatial audio-driven human motion generation. We aim to tackle a largely underexplored yet important problem of enabling virtual humans to move naturally in response to spatial audio—capturing not just what is heard, but also where the sound is coming from. To this end, we introduce the Spatial Audio-Driven Human Motion (SAM) dataset—the first comprehensive dataset featuring paired high-quality human motion and spatial audio recordings. For benchmarking, we develop a generative framework for human MOtion generation driven by SPAtial audio, termed MOSPA, which learns to synthesize realistic and diverse human motions conditioned on spatial audio input. We hope this research could provide a foundation for future research in spatial perception, virtual characters, and embodied AI. The dataset and model will be open-sourced soon. A big thank you to our intern, Shuyang Xu, for the wonderful collaboration! Congratulations, Shuyang! Project page: frank-zy-dou.github.io/projects/MOSPA… Paper: arxiv.org/abs/2507.11949 Video: youtu.be/p_xwTDA-K0g #Animation #CG #CV #AIGC #DL #Deeplearning #Motion #Graphics #AI #GenerativeAI

4.7K

Huaijin Pi @HuaijinPi · Nov 30

@WenjiaWang_HKU Thanks! 😊

107

Wenjia Wang @WenjiaWang_HKU · Nov 30

@HuaijinPi Great work!

110

Huaijin Pi @HuaijinPi · Nov 30

Come meet us at our San Diego poster! 🎉 📍 Exhibit Hall C,D,E, poster #5207 🕚 Wed, Dec 3 | 11 a.m.–2 p.m. PST Huge thanks to my amazing collaborators: Zhi Cen, @frankzydou, and Taku Komura

165

Huaijin Pi @HuaijinPi · Nov 30

CoDA is not limited to articulated objects — it also supports rigid object manipulation, producing stable, coordinated whole-body motions driven purely by text.

188

Huaijin Pi @HuaijinPi · Jun 10

@realzhenxu Awesome work

261

Zhen Xu @realzhenxu · Jun 10

Introducing the 4D Gaussian Transformer, a feed-forward dynamic Gaussian reconstructor trained only on monocular videos, achieving performance on par with optimization-based methods at a fraction of the inference time! Paper: arxiv.org/abs/2506.08015 Page: 4dgt.github.io

17.5K

Huaijin Pi @HuaijinPi · Jan 13

@realzhenxu Great work.

161

Zhen Xu @realzhenxu · Jan 13

Tired of short multi-view video datasets? Check out our new SelfCap dataset with up to 10 minutes of 24-camera, high-quality, dense-view recording at 4K resolution! Data released for our Long Volumetric Video paper: zju3dv.github.io/longvolcap Dataset link: forms.gle/MzJqZjBfyZ53fR…

178

13.5K
