Vincent Qin

1.5K posts

@AlphaRealcat

⭐️Focusing on Visual Localization, SfM and SLAM.

Joined March 2022

416 Following · 369 Followers

Pinned Tweet

Vincent Qin@AlphaRealcat·23 Jul

Image matching webui is now deployed on HF, link: huggingface.co/spaces/Realcat…

2.5K

Vincent Qin retweeted

Johan Edstedt @Parskatt·5d

Introducing LoMa, the next generation of feature matcher!

292

35.8K

Vincent Qin retweeted

Zhenjun Zhao@zhenjun_zhao·1 Apr

Fisheye3R: Adapting Unified 3D Feed-Forward Foundation Models to Fisheye Lenses
Ruxiao Duan, Erin Hong, Dongxu Zhao, Eric Turner, Alex Wong, Yunwen Zhou
tl;dr: in title
arxiv.org/abs/2603.28896

1.7K

Vincent Qin retweeted

Zhenjun Zhao@zhenjun_zhao·31 Mar

TerraSky3D: Multi-View Reconstructions of European Landmarks in 4K
Mattia D'Urso, Yuxi Hu, Christian Sormann, Mattia Rossi, Friedrich Fraundorfer
tl;dr: new 3D dataset
arxiv.org/abs/2603.28287

1.6K

Vincent Qin@AlphaRealcat·1 Apr

project page: icetea-cv.github.io/mv-roma

Zhenjun Zhao@zhenjun_zhao

MV-RoMa: From Pairwise Matching into Multi-View Track Reconstruction
Jongmin Lee, Seungyeop Kang, Sungjoo Yoo
tl;dr: pairwise matches->sampling->initial feature tracks->cross-attention->refined multi-view dense matches
arxiv.org/abs/2603.27542

165

Vincent Qin@AlphaRealcat·31 Mar

thin and clean point cloud😆

Johan Edstedt @Parskatt

SfM be like

Vincent Qin@AlphaRealcat·27 Mar

@gabriberton @GoogleDeepMind Congratulations, Gabriele🚀

342

Gabriele Berton@gabriberton·27 Mar

I have joined @GoogleDeepMind! I'll be training VLMs
And I'll still keep posting about latest developments on AI, Computer Vision and LLMs
So no more posts on PyTorch tricks. I might post about JAX. Stay tuned...

122

3.6K

145.5K

Vincent Qin@AlphaRealcat·17 Mar

@gabriberton 🚀🚀🚀Great!

306

Gabriele Berton@gabriberton·17 Mar

VisMatch is on pypi! VisMatch is a wrapper for image matching models, like LightGlue, RoMa-v2, MASt3R, LoFTR, and 50+ more!
It's literally as simple as:
pip install vismatch
vismatch-match --inputs img0 img1 --matcher choose_any
To run image matching on any 2 images [1/4]

416

50.2K

Vincent Qin retweeted

Zhenjun Zhao@zhenjun_zhao·10 Mar

FrameVGGT: Frame Evidence Rolling Memory for streaming VGGT
Zhisong Xu, Takeshi Oishi
tl;dr: not token-level compression, but block-level bounded retention
arxiv.org/abs/2603.07690

1.8K

Vincent Qin@AlphaRealcat·10 Mar

Code: github.com/VAISR/OVGGT

Kwang Moo Yi@kwangmoo_yi

Lu et al., "OVGGT: O(1) Constant-Cost Streaming Visual Geometry Transformer"
Updating KV caches with the most important tokens, using a fixed budget, with some anchors to prevent drifting. Constant budget, improved speed & quality.

122

Vincent Qin@AlphaRealcat·9 Mar

@xiaohu What about MiniMax M2.5?

1.4K

小互@xiaohu·9 Mar

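OpenClaw AI Agent "crayfish" capability leaderboard: it tests how often each vendor's large model succeeds at real coding tasks under the OpenClaw framework. Each model runs a standardized set of OpenClaw Agent tasks and is scored by automated checks plus LLM review, giving a task success rate per model. The top three are: Gemini 3 Flash Preview, MiniMax M2.1, Kimi K2.5. Next come: Claude Sonnet 4.5, Gemini 3 Pro Preview, Claude Haiku 4.5, Claude Opus 4.6. The three Claude family models are all above 90%, while GPT-5.2 reaches only 65.6% and ranks near the bottom; DeepSeek V3.2 is at around 82%.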

30.6K

Vincent Qin retweeted

Zhiwen(Aaron) Fan@zhiwen_fan_·5 Mar

What happens when VLMs meet 3D foundation models? See VLM-3R (CVPR 2026). VLM-3R links a vision-language model (e.g., Qwen) with 3D geometric foundation models (e.g., CUT3R) at metric scale. Given an uncalibrated video, it moves beyond pixels to perceive and reason in 3D space. Code (open source): vlm-3r.github.io

143

10.5K

Vincent Qin retweeted

Zhenjun Zhao@zhenjun_zhao·5 Mar

ZipMap: Linear-Time Stateful 3D Reconstruction with Test-Time Training
@Haian_Jin, Rundi Wu, @tianyuanzhang99, @RuiqiGao, @jon_barron, Noah Snavely, @holynski_
tl;dr: another(?) TTT+VGGT
arxiv.org/abs/2603.04385

2.8K

Vincent Qin@AlphaRealcat·27 Feb

Code: github.com/weitong8591/gl…

Tong Wei@weitong8591

Excited to share that our paper "Global-Aware Edge Prioritization for Pose Graph Initialization" has been accepted to CVPR 2026! #CVPR2026 See you soon in Denver!🥳🥳Code is coming soon💻 🫰Big thanks to my amazing co-authors Giorgos Tolias & supervisors: @majti89 @matas_jiri

327

Vincent Qin@AlphaRealcat·27 Feb

Code: github.com/ywh187/XStream…

Zhenjun Zhao@zhenjun_zhao

XStreamVGGT: Extremely Memory-Efficient Streaming Vision Geometry Grounded Transformer with KV Cache Compression
Zunhai Su, Weihao Ye, Hansen Feng, Keyu Fan, Jing Zhang, Dahai Yu, Zhengwu Liu, Ngai Wong
tl;dr: pruning and quantization->StreamVGGT
arxiv.org/abs/2602.21780

219

Vincent Qin retweeted

sasaki@engineer@rsasaki0109·26 Feb

VLG-Loc
Vision-Language Global Localization (VLG-Loc) is a global localization method that uses camera images and a human-readable labeled footprint map containing only names and areas of distinctive visual landmarks
github.com/CyberAgentAILa…

2.9K

Vincent Qin@AlphaRealcat·26 Feb

@yzly1 @zhenjun_zhao The author has withdrawn the paper.

nmsl❤️@yzly1·25 Feb

@zhenjun_zhao I think the results of XFeat and RIPE on MegaDepth seem unusually low. Are these the originally reported results, or were they obtained under different settings?

152

Vincent Qin retweeted

Zhenjun Zhao@zhenjun_zhao·25 Feb

From Pairs to Sequences: Track-Aware Policy Gradients for Keypoint Detection
Yepeng Liu, Hao Li, Liwen Yang, Fangzhen Li, Xudi Ge, Yuliang Gu, kuang Gao, Bing Wang, Guang Chen, Hangjun Ye, Yongchao Xu
tl;dr: multi-view version of RL-based method (RFP/RIPE) for detection; RDD as backbone; no eval. on IMC
arxiv.org/abs/2602.20630

1.8K

Vincent Qin retweeted

Zhenjun Zhao@zhenjun_zhao·24 Feb

OpenVO: Open-World Visual Odometry with Temporal Dynamics Awareness
@phucnda, @anh_n_nhu, @MingCLinCS
tl;dr: temporal dynamics+scene geometry
arxiv.org/abs/2602.19035

1.9K

Vincent Qin retweeted

Zhenjun Zhao@zhenjun_zhao·23 Feb

Have We Mastered Scale in Deep Monocular Visual SLAM? The ScaleMaster Dataset and Benchmark
Hyoseok Ju, Bokeon Suh, @GiseopK
tl;dr: in title
arxiv.org/abs/2602.18174

2.6K

Vincent Qin retweeted

Yiwen Zhang@YiwenZhangYZ·23 Feb

🚀 #CVPR2026 Accepted!🚀 Thrilled to share that my first-authored undergraduate paper, “Emergent Extreme-View Geometry in 3D Foundation Models,” has been accepted to CVPR 2026! 🎉 Looking forward to seeing many of you in Denver! ✈️ Project page: ext-3dfms.github.io

122

5.7K
