Zhengming Yu

24 posts

Zhengming Yu

@Proof_Yu

CS Ph.D. student in Texas A&M University

Joined January 2017

272 Following · 80 Followers

Zhengming Yu retweeted

Zhiyang (Frank) Dou@frankzydou·11 May

Introducing ✨RigidFormer: Learning Rigid Dynamics with Transformers - our attempt to scale learning-based physical dynamics with Transformers. RigidFormer learns rigid dynamics with Transformers. It is a mesh-free, object-centric Transformer for multi-object rigid-body contact dynamics from point clouds. Learning physics with purely neural simulators, without relying on traditional physics engines, is an important and widely studied problem. Prior SOTA methods often use graph neural networks for accuracy and generalization, but still struggle with efficient, high-fidelity simulation at scale. RigidFormer uses only point inputs, matches or outperforms mesh-based baselines on standard benchmarks, runs much faster, generalizes across point resolutions and datasets, and scales to 200+ objects. We also show a preliminary extension to command-conditioned articulated bodies by treating body parts as interacting object-level components. RigidFormer is mesh-free: it does not require mesh connectivity, SDFs, or vertex-level message passing, making it well-suited for point-cloud observations and scalable simulation. This architecture can also be adapted to learn soft-body dynamics by replacing the rigid-body module (differentiable Kabsch alignment). 🎬See our video for more details. Many thanks to my amazing collaborators: Minghao Guo @GuoMh14, Haixu Wu @Haixu_Wu_1998, Doug Roble, Tuur Stuyck @TuurStuyck, and Wojciech Matusik @wojmatusik. Project page: people.csail.mit.edu/frankzydou/pro… Paper: people.csail.mit.edu/frankzydou/pro…

295

566.5K
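
The Kabsch alignment mentioned in the RigidFormer post above finds the rotation and translation that best map one point set onto another in the least-squares sense. The sketch below is a minimal 2D analogue in plain Python, where the optimal angle has a closed form. It is not RigidFormer's implementation, which the post describes only as a differentiable Kabsch module (the usual 3D form relies on an SVD). The names `kabsch_2d` and `apply_rigid` are illustrative assumptions.

```python
import math


def kabsch_2d(source, target):
    """Return (angle, translation) of the rigid motion that best maps
    `source` onto `target` in the least-squares sense (2D case)."""
    n = len(source)
    # Centroids of both point sets.
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(q[0] for q in target) / n
    ty = sum(q[1] for q in target) / n
    # Accumulate dot and cross products of the centered point pairs.
    dot = 0.0
    cross = 0.0
    for (px, py), (qx, qy) in zip(source, target):
        ax, ay = px - sx, py - sy
        bx, by = qx - tx, qy - ty
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    # The angle maximizing the alignment score cos(a)*dot + sin(a)*cross.
    angle = math.atan2(cross, dot)
    c, s = math.cos(angle), math.sin(angle)
    # Translation that carries the rotated source centroid onto the target centroid.
    trans = (tx - (c * sx - s * sy), ty - (s * sx + c * sy))
    return angle, trans


def apply_rigid(points, angle, trans):
    """Rotate each point by `angle` (radians), then translate by `trans`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + trans[0], s * x + c * y + trans[1]) for x, y in points]


# Hypothetical usage: recover a known rigid motion from matched points.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = apply_rigid(src, 0.5, (2.0, -1.0))
est_angle, est_trans = kabsch_2d(src, dst)
```

Because every step is a smooth function of the input coordinates, the same computation can in principle be differentiated through, which is the property a learned simulator would rely on.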

Zhengming Yu retweeted

Zhiyang (Frank) Dou@frankzydou·1 May

Excited to share that our work NeuralActuator: Neural Actuation Modeling for Robot Dynamics and External Force Perception has been accepted to #RSS2026! Your robot — even a low-cost one — can feel external forces without torque or tactile sensors. TL;DR: NeuralActuator is a neural actuator model that jointly predicts 1️⃣torque to capture the nonlinear and time-varying current–to–torque relationship of low-cost servos, 2️⃣external contact forces (and force detection gates) for sensorless force perception, 3️⃣and motor conditions that indicate each motor’s operating regime. Here is a fast-forward video clip ⬇️ We are also covering more robots like LeRobot-S101 and Franka Panda. More details coming soon.

326

39.5K

Zhengming Yu retweeted

Zhiyang (Frank) Dou@frankzydou·27 Apr

SAD: Soft Anisotropic Diagrams for Differentiable Image Representation has been accepted by #SIGGRAPH2026 Check it out, and huge congrats to Lucky! @Luckyballa #SAD represents an image as a soft, anisotropic, differentiable diagram over learnable sites. Each pixel is modeled as a softmax blend over its top-K nearby sites under a site-dependent distance, yielding a differentiable partition of unity with explicit ownership and content-aligned boundaries. A GPU-friendly top-K propagation scheme keeps the cost constant per pixel, enabling fast fitting at matched or better quality. Classical geometric structures can still inspire fresh perspectives in modern visual computing. Voronoi and Power diagrams have long been elegant tools for 3D shape analysis, reconstruction, and geometric reasoning; here, related diagram ideas, with connections to Apollonius-style diagrams, are explored for image representations. Homepage: luckyiyi.github.io/SAD/ arXiv: arxiv.org/pdf/2604.21984 #SIGGRAPH2026 #SIGGRAPH #CV #Vision #Graphics #CG

Lucky Iyinbor@Luckyballa

This January, I decided to give it a shot and wrote my first paper Today, I am happy to share that it was accepted by #SIGGRAPH2026 SAD is a differentiable image representation with soft, anisotropic partitioning, with up to 20x faster encoding time🧵 luckyiyi.github.io/SAD/index.html

Cambridge, MA 🇺🇸 English

3.5K
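
The SAD post above describes each pixel as a softmax blend over its top-K nearby sites under a site-dependent distance. The sketch below illustrates that idea for a single pixel in plain Python. The per-site 2x2 metric, the `temperature` parameter, the dictionary layout, and the brute-force nearest-site search are all assumptions made for illustration; the paper's actual distance and its GPU top-K propagation scheme are not reproduced here.

```python
import math


def blend_pixel(pixel, sites, k=2, temperature=1.0):
    """Blend site colors at one pixel with a softmax over its k nearest sites.

    Each site is a dict with:
      'pos':    (x, y) site location
      'metric': (a, b, c) for the symmetric 2x2 matrix [[a, b], [b, c]]
      'color':  (r, g, b)
    Returns (weights, color); the weights sum to 1.
    """
    scored = []
    for site in sites:
        dx = pixel[0] - site["pos"][0]
        dy = pixel[1] - site["pos"][1]
        a, b, c = site["metric"]
        # Anisotropic squared distance v^T M v with a per-site matrix M.
        dist = a * dx * dx + 2.0 * b * dx * dy + c * dy * dy
        scored.append((dist, site["color"]))
    # Brute-force top-k selection (assumption: fine for a tiny example).
    nearest = sorted(scored, key=lambda item: item[0])[:k]
    # Numerically stable softmax over negative distances.
    d_min = nearest[0][0]
    exps = [math.exp(-(d - d_min) / temperature) for d, _ in nearest]
    total = sum(exps)
    weights = [e / total for e in exps]
    color = tuple(
        sum(w * col[ch] for w, (_, col) in zip(weights, nearest))
        for ch in range(3)
    )
    return weights, color


# Hypothetical sites: two isotropic sites close by and one far away.
example_sites = [
    {"pos": (0.0, 0.0), "metric": (1.0, 0.0, 1.0), "color": (1.0, 0.0, 0.0)},
    {"pos": (2.0, 0.0), "metric": (1.0, 0.0, 1.0), "color": (0.0, 0.0, 1.0)},
    {"pos": (10.0, 10.0), "metric": (1.0, 0.0, 1.0), "color": (0.0, 1.0, 0.0)},
]
weights, color = blend_pixel((1.0, 0.0), example_sites, k=2)
```

Since the weights are a softmax, they are positive and sum to one at every pixel, which is what the post calls a differentiable partition of unity.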

Zhengming Yu@Proof_Yu·23 Mar

@HavenFeng Good point lol🤣

Haven Feng@HavenFeng·22 Mar

“World model” is such an overloaded term now. Seriously, until when will we start considering a terrestrial globe 🌍 as a world model? It’s clearly about the world, has 3D consistency, and very persistent memory (no matter how many times you rotate it) 🤣

Xun Huang@xxunhuang

True. Static 3DGS is not a world model. A world model needs to understand action and reaction, cause and effect.

6.8K

Zhengming Yu retweeted

Zhiyang (Frank) Dou@frankzydou·18 Mar

We have seen many works unlock the power of pretrained models for images and videos🏞️. But what about human motion🕺💃? Can we leverage a pretrained motion prior for a wide range of downstream tasks? Yes!! UMO is a simple yet effective framework that, for the first time, unlocks the priors of a motion foundation model (i.e., HY-Motion) for 10+ tasks, including editing, reaction generation, stylization, trajectory control, obstacle avoidance, keyframe infilling, and more. Amazing work! @xiaoyan_cong and @kunkun0w0. 🏠Webpage: oliver-cong02.github.io/UMO.github.io/ 📄 Paper: arxiv.org/abs/2603.15975 With the growing number of tools for transferring SMPL motion to humanoids, we hope it could also become a source of skills for humanoid robot learning. #Graphics #Motion #Animation #AIGC #GenerativeAI #Vision #3DV #Robotics #Robot #Humanoid #Learning #GenAI #Animation

Xiaoyan Cong@xiaoyan_cong

💡Introducing 𝑼𝑴𝑶 -- one unified model that unlocks motion foundation model (HY-Motion @TencentHunyuan) priors for 𝟏𝟎+ 𝐭𝐚𝐬𝐤𝐬: 𝐞𝐝𝐢𝐭𝐢𝐧𝐠, 𝐫𝐞𝐚𝐜𝐭𝐢𝐨𝐧 𝐠𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐨𝐧, 𝐬𝐭𝐲𝐥𝐢𝐳𝐚𝐭𝐢𝐨𝐧, 𝐭𝐫𝐚𝐣𝐞𝐜𝐭𝐨𝐫𝐲 𝐜𝐨𝐧𝐭𝐫𝐨𝐥, 𝐨𝐛𝐬𝐭𝐚𝐜𝐥𝐞 𝐚𝐯𝐨𝐢𝐝𝐚𝐧𝐜𝐞, 𝐤𝐞𝐲𝐟𝐫𝐚𝐦𝐞 𝐢𝐧𝐟𝐢𝐥𝐥𝐢𝐧𝐠... (1/8) 🌐 Webpage: oliver-cong02.github.io/UMO.github.io/ 📄 Paper: arxiv.org/abs/2603.15975

8.8K

Zhengming Yu retweeted

Jingdong Zhang@jdzhang0929·14 Dec

🎉 Excited to share our SIGGRAPH Asia 2025 paper: SPGen! We propose Spherical Projection (SP) as a consistent and flexible representation for Single Image to 3D mesh generation. 🌍➡️🧊 #SIGGRAPHAsia2025 #3DGeneration #GenerativeAI #ComputerVision #DiffusionModels

4.3K

Zhengming Yu@Proof_Yu·15 Aug

@QianqianWang5 Congrats!

400

Qianqian Wang@QianqianWang5·15 Aug

📢Thrilled to share that I'll be joining Harvard and the Kempner Institute as an Assistant Professor starting Fall 2026! I'll be recruiting students this year for the Fall 2026 admissions cycle. Hope you apply!

Kempner Institute at Harvard University@KempnerInst

We are thrilled to share the appointment of @QianqianWang5 as a #KempnerInstitute Investigator! She will bring her expertise in computer vision to @Harvard. Read the announcement: bit.ly/4mIghHy @hseas #AI #ComputerVision

101

748

112.4K

Zhengming Yu retweeted

Tianye Li@_TianyeLi·14 Aug

Learned from in-the-wild images, GAIA generates 3D Gaussian animatable avatars with identity & expression control and real-time animation 📅 Talk: Aug 14, 9 AM PDT – West Bldg, Rm 211–214 🎮 Live demo @ poster session & E-Tech 🔗research.nvidia.com/labs/amri/proj… #SIGGRAPH2025 @NVIDIAAI

14.9K

Zhengming Yu retweeted

Zhiyang (Frank) Dou@frankzydou·23 Apr

Hello 🇸🇬 Singapore! At #ICLR2025, I’ll be presenting our work 🎲DICE from @LingjieLiu1's lab! With DICE, one can explore hand-face interactions 📷 — this feedforward method simultaneously estimates hand and face poses, contact points, and deformations from a single image using a Transformer-based architecture. Come join us! 📍 Hall 3 + Hall 2B #130 Poster Session 6 🕒 Sat 26 Apr, 3–5:30 p.m. 🎥 Check out the video for more details! Huge thanks to all our amazing collaborators who made this possible: @qingxuan_wu @xu_sirui , Soshi Shimada, @chenwangcw , @Proof_Yu , @YuanLiu41955461 , @_cheng_lin , Zeyu Cao, Taku Komura, @VGolyanik , Christian Theobalt, Wenping Wang, and @LingjieLiu1. #ICLR25 #ICLR2025 #AI #Animation #CV #CG #Interaction

Zhiyang (Frank) Dou@frankzydou

🔍 Check out our latest research on 3D hand-face interactions! Please check the video🎥! 🔥 Introducing 🎲 ᗪIᑕE, the first end-to-end method that captures hand-face interactions and deformations from a single image. 🎯 Our method achieves state-of-the-art accuracy while reconstructing plausible interactions and deformations. 🚀 In addition, it runs blazingly fast with a 20 fps speed on a 4090 GPU, enabling various downstream applications including AR/VR, character animation, and human behavior analysis. Paper: arxiv.org/abs/2406.17988 Code: github.com/Qingxuan-Wu/DI… Project page: frank-zy-dou.github.io/projects/DICE/… Great job @qingxuan_wu! #Animation #AR #VR #XR #AI #CV #3DV #AIGC #Human #Interaction #Reconstruction

Plentong, Johor 🇲🇾 English

8.1K

Zhengming Yu retweeted

MrNeRF@janusch_patas·8 Apr

3R-GS: Best Practice in Optimizing Camera Poses Along with 3DGS

Contributions:
1. We propose 3R-GS, a robust method for reconstructing high-quality 3D Gaussians and poses from the MASt3R's imperfect output cameras.
2. Identifying two main challenges in bundle-adjusting 3DGS, we propose an effective solution that combines 3DGS-MCMC, an MLP-based pose refiner, and an epipolar distance loss to address these issues.
3. Our experiments demonstrate the superior performance of 3R-GS in both novel view synthesis and camera pose estimation.

185

11K
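
The epipolar distance loss named in the 3R-GS post above is built on a standard geometric quantity: the distance from a point in one image to the epipolar line induced by its match in the other image. The sketch below computes that distance for a single correspondence in plain Python. The function name `epipolar_distance`, the nested-list matrix layout, and the example fundamental matrix are assumptions for illustration; how 3R-GS weights or aggregates such terms is not stated in the post.

```python
import math


def epipolar_distance(F, x1, x2):
    """Distance from point x2 (image 2) to the epipolar line F @ [x1, 1]
    induced by point x1 (image 1). F is a 3x3 matrix as nested lists."""
    u, v = x1
    # Epipolar line coefficients (a, b, c) of a*x + b*y + c = 0 in image 2.
    a, b, c = (row[0] * u + row[1] * v + row[2] for row in F)
    norm = math.hypot(a, b)
    if norm == 0.0:
        raise ValueError("degenerate epipolar line")
    return abs(a * x2[0] + b * x2[1] + c) / norm


# Hypothetical setup: identity intrinsics, no rotation, translation along x.
# Then F = [t]_x with t = (1, 0, 0), and epipolar lines are horizontal.
F_example = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]
on_line = epipolar_distance(F_example, (0.2, 0.3), (0.5, 0.3))
off_line = epipolar_distance(F_example, (0.2, 0.3), (0.5, 0.7))
```

A correspondence that is consistent with the camera poses lies on its epipolar line and gives zero distance, so penalizing this value pushes the estimated poses toward agreement with the matches.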

Zhengming Yu retweeted

Zhiyang (Frank) Dou@frankzydou·23 Jan

DICE 🎲 was accepted by #ICLR2025. With DICE, one can learn more about hand-face interactions 👤🤏. This end-to-end method also enables better scalability to learn and model hand-face interaction. Congrats to @qingxuan_wu ! Check the video for more details :) #ICLR #ICLR2025

Zhiyang (Frank) Dou@frankzydou

5.1K

Zhengming Yu retweeted

Zhenjun Zhao@zhenjun_zhao·23 Dec

SolidGS: Consolidating Gaussian Surfel Splatting for Sparse-View Surface Reconstruction Zhuowen Shen, @YuanLiu41955461, Zhang Chen, @tom44409897, @w080707, Yongqing Liang, @Proof_Yu, @jdzhang0929, Yi Xu, Scott Schaefer, Xin Li, Wenping Wang arxiv.org/abs/2412.15400

4.1K

Zhengming Yu@Proof_Yu·7 Dec

@JingxiangSun42 NB! Congratulations!!

Jingxiang Sun@JingxiangSun42·7 Dec

Reached 1000 citations🎉 A small milestone for me🤓

1.6K

Zhengming Yu retweeted

Zhiyang (Frank) Dou@frankzydou·22 Nov

#SIGGRAPHASIA #SIGGRAPHASIA2024 #ACMTOG 🐟 We introduce Collective Behavior Imitation Learning (CBIL), a scalable, self-supervised framework for learning fish schooling behaviors from videos, to be presented at SIGGRAPH ASIA 2024 Tokyo (journal track)🗼! 🦈Reproducing realistic collective behaviors presents a captivating challenge, as traditional rule-based methods fall short in realism, while data-driven approaches rely on hard-to-acquire motion trajectories. 🐠CBIL first leverages a Masked Video AutoEncoder (MVAE) to map 2D observations to expressive latent states. Then an adversarial imitation learning framework with bio-inspired rewards is developed for stable and realistic motion generation. We demonstrate CBIL's effectiveness across various fish body shapes and its capability to detect abnormal behaviors from in-the-wild videos (real2sim4real). I like this attempt to “inject data priors” for Visual Imitation Learning, especially given the challenges of obtaining ground truth 3D motion for imitation. A heartfelt thanks to our amazing intern, Yifan Wu @Littlecobbler! Always remember the excitement we felt during those sleepless nights! And special gratitude to our incredible collaborators: Yuko Ishiwaka, Shun Ogawa, Yuke Lou, Wenping Wang, @LingjieLiu1, and Taku Komura. 🏠Project Page: frank-zy-dou.github.io/projects/CBIL/… 📑Paper: ACM Transactions on Graphics dl.acm.org/doi/10.1145/36… #AI #ImitationLearning #Animation #Animation #AI #CrowdAnimation #BehaviorAnalysis #CrowdMotion #Graphics #CG #Motion #AIGC #MotionSynthesis We showcase real-world videos and synthesized results, aligned for clearer visualization (rather than as reconstructions).

Philadelphia, PA 🇺🇸 English

9.4K

Zhengming Yu retweeted

Jionghao Wang@ShaneMankiw·3 Oct

Come and check out SO-SMPL at poster 301, tomorrow (Friday, Oct. 4th) during 10:30 - 12:30! Code is also released here: github.com/shanemankiw/SO…

Jionghao Wang@ShaneMankiw

Want to generate animation-ready 3D avatars with disentangled clothes, from just text descriptions? Introducing our new work featuring a simple yet effective representation, SO(Sequentially Offset)-SMPL! (1/4) arxiv: arxiv.org/abs/2312.05295 project: shanemankiw.github.io/SO-SMPL/

4.3K

Zhengming Yu@Proof_Yu·2 Oct

Surf-D will be presented today. Please check it out with our handsome boy @frankzydou if you're at ECCV 2024 @eccvconf!! 👗 Surf-D: Generating High-Quality Surfaces of Arbitrary Topologies Using Diffusion Models 📅 Wed, Oct 2 | 16:30 - 18:30 | Poster Session 4 | Poster 285

Zhengming Yu@Proof_Yu

Our Surf-D is accepted at ECCV 2024 @eccvconf. Code is released. Surf-D can generate shapes of arbitrary topology in high resolution using a UDF representation. Project page: yzmblog.github.io/projects/SurfD/ Arxiv: arxiv.org/abs/2311.17050 Code: github.com/Yzmblog/SurfD #eccv #SurfD

3.2K

Zhengming Yu retweeted

Qingxuan Wu@qingxuan_wu·7 Jul

🔍 Check out our latest research on 3D hand-face interactions! 🔥 Introducing 🎲 DICE, the first end-to-end method that captures hand-face interactions and deformations from a single image. (1/n)

27.5K

Zhengming Yu@Proof_Yu·3 Jul

14.6K

Zhengming Yu retweeted

Zhiyang (Frank) Dou@frankzydou·1 Jul

Got five papers accepted by #ECCV2024 @eccvconf ! Huge thanks to all my collaborators! 😃 See you in Milan 🇮🇹

Summary of Selected Works (I made a fast-forward for them 😄)
- [Shape Generation] Surf-D: Generating High-Quality Surfaces of Arbitrary Topologies Using Diffusion Models, ECCV 2024.
- [Efficient Motion Generation] EMDM: Efficient Motion Diffusion Model for Fast, High-Quality Human Motion Generation, ECCV 2024.
- [Controllable Motion Generation] TLControl: Trajectory and Language Control for Human Motion Synthesis, ECCV 2024.
- [Avatar Generation] Disentangled Clothed Avatar Generation from Text Descriptions, ECCV 2024.

Project Page:
Surf-D: yzmblog.github.io/projects/SurfD/
EMDM: frank-zy-dou.github.io/projects/EMDM/…
TLControl: tlcontrol.weilinwl.com
SOSMPL: shanemankiw.github.io/SO-SMPL/

151

18.1K

Zhengming Yu@Proof_Yu·29 Dec

Real time text-to-motion with physical plausibility. Super cool!

Zhiyang (Frank) Dou@frankzydou

🔥Yes! You can achieve REAL-TIME text-to-motion generation using a simulated humanoid to perform various skills! This feat is realized through the integration of PHC and EMDM. 💬This combination addresses two pivotal challenges in human motion synthesis: ensuring physical plausibility and enhancing motion generation efficiency. Hats off to @zhengyiluo for the fantastic work! ☺️ PHC: github.com/ZhengyiLuo/Per… EMDM: frank-zy-dou.github.io/projects/EMDM/… #AIGC #Animation #Character #Generation #3D #AI #Motion #SMPL #Graphics #SOSMPL #IsaacGym #Simulation

364
