Ruizhi Shao
@RZ_Shao
Building True Intelligence in Real World @Rhoda_AI_ Prev. PhD @Tsinghua_uni

After operating in stealth for the last 18 months @rhodaai, we’re excited today to finally show the world what we’ve been working on. We believe we’re on a path to physical AGI with the launch of our brand new foundation model, the Direct Video Action (DVA) model.


Video generation of humans with control over body pose and facial expressions is crucial for a plethora of applications. Towards this goal, we introduce a new interspatial attention (ISA) mechanism as a scalable building block for DiT-based video generation models #SIGGRAPH2025

Introduce HumanPlus - Shadowing part
Humanoids are born for using human data. We build a real-time shadowing system using a single RGB camera and a whole-body policy for cloning human motion.
Examples:
- boxing🥊
- playing the piano🎹/ping pong
- tossing
- typing
Open-sourced!

Introducing Proteus 0.1, REAL-TIME video generation that brings life to your AI. Proteus can laugh, rap, sing, blink, smile, talk, and more. From a single image! Come meet Proteus on Twitch in real-time. ↓ Sign up for API waitlist: apparate.ai/early-access.h… 1/11


Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control
Research on video generation has recently made tremendous progress, enabling high-quality videos to be generated from text prompts or images. Adding control to the video generation

Tele-Aloha: A Low-budget and High-authenticity Telepresence System Using Sparse RGB Cameras
In this paper, we present a low-budget and high-authenticity bidirectional telepresence system, Tele-Aloha, targeting peer-to-peer communication scenarios. Compared to previous systems,









