Ruizhi Shao

61 posts

@RZ_Shao

Building True Intelligence in Real World @Rhoda_AI_ Prev. PhD @Tsinghua_uni

Beijing, China · Joined June 2023
196 Following · 198 Followers
Ruizhi Shao retweeted
Rhoda AI (@RhodaAI)
Most robot demos are “golden runs”: a perfect take selected from many attempts. But real-world deployment is about Continuous Operation. Watch our DVA model tackle a real-world decanting task for 1.5 hours straight: Uncut, Zero human intervention. 🧵👇
4
10
45
4.1K
Rhoda AI (@RhodaAI)
The gap between robotics in the lab and robotics in the real world has been one of the hardest unsolved problems in the industry. We’re excited to come out of stealth and show the research community how we’re tackling the issue. Bloomberg article in comment.
13
13
90
19.9K
Ruizhi Shao (@RZ_Shao)
As robots tackle increasingly general tasks, scaling robot action data has become a critical bottleneck. Our proposed DVA pioneers a new path: learning the physical world and experience from tons of videos, and transferring across any robotic platform and task. Super cool direction!
Jagdeep Singh (@startupjag)

After operating in stealth for the last 18 months @rhodaai, we’re excited today to finally show the world what we’ve been working on. We believe we’re on a path to physical AGI with the launch of our brand new foundation model, the Direct Video Action (DVA) model.

0
0
5
360
Ruizhi Shao (@RZ_Shao)
@rhodaai Thrilled by the future of Physical AI! My journey at Rhoda has strengthened my conviction in our path toward generalist robotics. Onward to making true intelligence a reality.
0
0
2
65
Ruizhi Shao retweeted
Rhoda AI (@RhodaAI)
03.10.26
15
33
249
48.1K
Ruizhi Shao retweeted
Rhoda AI (@RhodaAI)
7
31
147
25.9K
Ruizhi Shao retweeted
Ruilong Li (@ruilong_li)
Excited to announce 🚀gsplat v1.0🚀: a ⏩efficient⏩ CUDA backend for 3D Gaussian Splatting! docs.gsplat.studio
A drop-in replacement of the official impl. with:
- Up to 2x faster training;
- Up to 4x less GPU memory;
- Render millions of GSs in real-time;
- And more;
10
106
523
87.7K
Prime (Shengqu) Cai (@prime_cai)
We've been exploring a fun and challenging idea: combining top-notch video generation models with multi-view capabilities. This seems like a natural next step after all the amazing video and multi-image generation models. The question was, could we create a multi-view video generation model without needing a 4D dataset? 🤔
Check out our latest effort, where we've made a first attempt at crafting multi-view, or more precisely multi-trajectory, videos that maintain consistent underlying dynamic content, all without the need for 4D data!
🌐Website: collaborativevideodiffusion.github.io
📄Paper: arxiv.org/abs/2405.17414
👾Code: github.com/CollaborativeV…
Thanks @_akhaliq for sharing!
AK (@_akhaliq)

Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control
Research on video generation has recently made tremendous progress, enabling high-quality videos to be generated from text prompts or images. Adding control to the video generation

4
18
93
25.7K
Ruizhi Shao retweeted
AK (@_akhaliq)
COLMAP-Free 3D Gaussian Splatting
paper page: huggingface.co/papers/2312.07…
While neural rendering has led to impressive advances in scene reconstruction and novel view synthesis, it relies heavily on accurately pre-computed camera poses. To relax this constraint, multiple efforts have been made to train Neural Radiance Fields (NeRFs) without pre-processed camera poses. However, the implicit representations of NeRFs provide extra challenges to optimize the 3D structure and camera poses at the same time. On the other hand, the recently proposed 3D Gaussian Splatting provides new opportunities given its explicit point cloud representations. This paper leverages both the explicit geometric representation and the continuity of the input video stream to perform novel view synthesis without any SfM preprocessing. We process the input frames in a sequential manner and progressively grow the 3D Gaussians set by taking one input frame at a time, without the need to pre-compute the camera poses. Our method significantly improves over previous approaches in view synthesis and camera pose estimation under large motion changes.
1
50
278
63K
Ruizhi Shao retweeted
Linus ✦ Ekenstam (@LinusEkenstam)
Relightable Real-Time Avatars Meta Codec Avatars 2.0 gets an update, building on 3D Gaussian Splatting from Meta. Accuracy is down to the human hair strand level 🔬 🧵 A thread
37
331
1.8K
351.9K
Ruizhi Shao retweeted
Anton Obukhov (@AntonObukhov1)
Introducing Marigold 🌼 - a universal monocular depth estimator, delivering incredibly sharp predictions in the wild! Based on Stable Diffusion, it is trained with synthetic depth data only and excels in zero-shot adaptation to real-world imagery. Check it out:
🌐 Website: marigoldmonodepth.github.io
🤗 Hugging Face Space: huggingface.co/spaces/toshas/…
📄 Paper: arxiv.org/abs/2312.02145
👾 Code: github.com/prs-eth/marigo…
The team: Bingxin Ke (@KBingxin), yours truly (@AntonObukhov1), Shengyu Huang (@ShengyHuang), Nando Metzger (@NandoMetzger), Rodrigo Caye Daudt (@rcdaudt), and Konrad Schindler.
#ComputerVision #PRS #ETHZurich
43
267
1.4K
489.6K
Ruizhi Shao retweeted
Hong-Xing (Koven) Yu (@Koven_Yu)
Can generative AI imagine what Alice saw in her journey in the Wonderland 🏞️🚶‍♀️? Introducing WonderJourney: Create a journey (a long sequence of diverse yet connected 3D scenes) from a single image or text! 🧵1/N Web: kovenyu.com/wonderjourney/ arxiv: arxiv.org/abs/2312.03884
10
116
505
109.1K
Ruizhi Shao retweeted
A.I.Warper (@AIWarper)
Here's probably one of the hardest shots I've ever attempted:
1) Two characters
2) Two different races of humans (prompt bleeding)
3) Very fast paced with a lot of motion
Was very fun and I learned a lot. Also developed a bunch of new workflows.
#StableDiffusion #Matrix #Pixar
894
489
4.4K
8.6M
Ruizhi Shao retweeted
AK (@_akhaliq)
Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians
paper page: huggingface.co/papers/2312.03…
Creating high-fidelity 3D head avatars has always been a research hotspot, but there remains a great challenge under lightweight sparse view setups. In this paper, we propose Gaussian Head Avatar represented by controllable 3D Gaussians for high-fidelity head avatar modeling. We optimize the neutral 3D Gaussians and a fully learned MLP-based deformation field to capture complex expressions. The two parts benefit each other, thereby our method can model fine-grained dynamic details while ensuring expression accuracy. Furthermore, we devise a well-designed geometry-guided initialization strategy based on implicit SDF and Deep Marching Tetrahedra for the stability and convergence of the training procedure. Experiments show our approach outperforms other state-of-the-art sparse-view methods, achieving ultra high-fidelity rendering quality at 2K resolution even under exaggerated expressions.
5
117
512
65.8K