Ruojin Cai

18 posts

@ruojin8

PhD student at Cornell CS

Joined April 2021

137 Following, 309 Followers

Ruojin Cai @ruojin8 · Dec 23

Project page: inter-pose.github.io Paper: arxiv.org/abs/2412.16155 Many thanks to the amazing team: Jason Y. Zhang (@jasonyzhang2), Philipp Henzler (@philipphenzler), Zhengqi Li (@zhengqi_li), Noah Snavely (@Jimantha), and Ricardo Martin-Brualla (@rmbrualla).

627

Ruojin Cai @ruojin8 · Dec 23

This also applies to MASt3R. While MASt3R excels on overlapping pairs via feature matching, it struggles with non-overlapping ones because correspondences become unreliable. InterPose remains robust, outperforming MASt3R on outward-facing datasets and matching it on center-facing ones.

719

Ruojin Cai @ruojin8 · Dec 23

🤔Can Generative Video Models Help Pose Estimation? ✅Yes! We find that generative video models can hallucinate plausible intermediate frames that provide useful context for pose estimators (e.g. DUSt3R), especially for images with little to no overlap. 🔗 inter-pose.github.io

220

31K

Ruojin Cai retweeted

Yuanbo Xiangli @ambie_kk · Dec 11

Introducing Doppelgangers++! 🚀 An enhanced pairwise image classifier that tackles visual aliasing (doppelgangers) to improve 3D reconstruction accuracy across diverse, real-world scenes. 🌍✨ 🔗Project page: bit.ly/3VAPMJc. Code is also available.

106

9.3K

Ruojin Cai retweeted

Dmytro Mishkin 🇺🇦 @ducha_aiki · Dec 10

Doppelgangers++: Improved Visual Disambiguation with Geometric 3D Features Yuanbo Xiangli, Ruojin Cai, Hanyu Chen, Jeffrey Byrne, @Jimantha tl;dr: new dataset (55K pairs) + Mast3r == PROFIT arxiv.org/abs/2412.05826

5.5K

Ruojin Cai retweeted

Dmytro Mishkin 🇺🇦 @ducha_aiki · Nov 18

Extreme Rotation Estimation in the Wild Hana Bezalel, Dotan Ankri, @ruojin8 @ElorHadar tl;dr: MegaDepth/Scenes subset with small/large/no overlap image pairs, the task is R prediction arxiv.org/abs/2411.07096

4.2K

Ruojin Cai retweeted

Gene Chou @gene_ch0u · Jun 20

Introducing MegaScenes—a scene-level dataset containing 100K SfM reconstructions and 2M images with open content licenses. We validate its effectiveness in training large-scale, generalizable models on the task of novel view synthesis. 1/N project page: megascenes.github.io

197

20.8K

Ruojin Cai @ruojin8 · Sep 8

Our code and data are available at: github.com/RuojinCai/Dopp… Paper: arxiv.org/abs/2309.02420 Thanks to my collaborators Joseph Tung (@jt_tung), Qianqian Wang (@QianqianWang5), Hadar Averbuch-Elor (@ElorHadar), Bharath Hariharan (@BharathHarihar3) and Noah Snavely (@Jimantha).

509

Ruojin Cai @ruojin8 · Sep 8

Our trained classifier works remarkably well, and can be used to filter out incorrect pairs after the COLMAP matching stage, helping COLMAP to produce correct reconstructions.

502

Ruojin Cai @ruojin8 · Sep 8

Check out our #ICCV2023 paper, Doppelgangers. We train a classifier to detect distinct but visually similar image pairs ("doppelgangers") and apply it to SfM disambiguation, enabling COLMAP to create correct 3D models in hard cases. Project page: doppelgangers-3d.github.io

AK@_akhaliq

Doppelgangers: Learning to Disambiguate Images of Similar Structures paper page: huggingface.co/papers/2309.02… We consider the visual disambiguation task of determining whether a pair of visually similar images depict the same or distinct 3D surfaces (e.g., the same or opposite sides of a symmetric building). Illusory image matches, where two images observe distinct but visually similar 3D surfaces, can be challenging for humans to differentiate, and can also lead 3D reconstruction algorithms to produce erroneous results. We propose a learning-based approach to visual disambiguation, formulating it as a binary classification task on image pairs. To that end, we introduce a new dataset for this problem, Doppelgangers, which includes image pairs of similar structures with ground truth labels. We also design a network architecture that takes the spatial distribution of local keypoints and matches as input, allowing for better reasoning about both local and global cues. Our evaluation shows that our method can distinguish illusory matches in difficult cases, and can be integrated into SfM pipelines to produce correct, disambiguated 3D reconstructions.

187

67.6K

Ruojin Cai @ruojin8 · Apr 30

Can we estimate relative rotations between *non-overlapping* images? Yes! Our new #CVPR2021 paper uses 4D correlation volumes to reason about implicit cues that reveal their geometric relationship. With @BharathHarihar3 @Jimantha @elorhadar. Project page: ruojincai.github.io/ExtremeRotatio…
