Arjun Karpur

19 posts

@arjunkarpur

Head of ML @ Epsilon Health | Prev. @GoogleDeepMind @UTCompSci | Training image encoders & VLMs for radiology

San Francisco, CA · Joined May 2024

80 Following · 61 Followers

Pinned Tweet

Arjun Karpur@arjunkarpur·31 Oct

Epsilon is hiring, come join us! We're looking for Research Scientists, Research Engineers, and Full Stack SWEs to help us scale vision encoders and VLMs for radiology🩻 Jobs: jobs.ashbyhq.com/epsilon-health Website: epsilon.health Or email me! arjun [at] epsilonlabs . ai

148

Arjun Karpur retweeted

André Araujo@andrefaraujo·1d

True multimodal AI needs to understand the world spatially 🎯 🚀 Excited to release #CVPR2026 TIPSv2 from @GoogleDeepMind, a foundational image-text encoder with spatial awareness, leading to strong overall results and massive gains on patch-text alignment. 🔥 1/N

712

77.1K

Arjun Karpur@arjunkarpur·25 Apr

Thanks to everyone that stopped by today! Open-source code and weights for TIPS are available at: gdm-tips.github.io #ICLR2025

1.5K

Arjun Karpur@arjunkarpur·25 Apr

Excited to be presenting TIPS at this morning’s #ICLR2025 poster session! Come by poster #318 and say hi 👋 w/ @kfrancischen @andrefaraujo @kmaninis #ICLR #ICLR25

770

Arjun Karpur@arjunkarpur·19 Mar

@StanSzymanowicz Congrats on the release, Stan!

440

Stan Szymanowicz@StanSzymanowicz·19 Mar

⚡️ Introducing Bolt3D ⚡️ Bolt3D generates interactive 3D scenes in less than 7 seconds on a single GPU from one or more images. It features a latent diffusion model that *directly* generates 3D Gaussians of seen and unseen regions, without any test time optimization. 🧵👇 (1/9)

535

125.6K

Arjun Karpur retweeted

André Araujo@andrefaraujo·18 Mar

Multimodal AI encoders often lack spatial understanding… but not anymore! Our #ICLR2025 TIPS model (Text-Image Pretraining with Spatial awareness) from @GoogleDeepMind can help 💡🚀 Check out our strong & versatile image-text encoder 💪 Paper & code: arxiv.org/abs/2410.16512

322

35.4K

Arjun Karpur retweeted

Kevis-Kokitsi Maninis@kmaninis·11 Mar

📢📢 We released checkpoints and Pytorch/Jax code for TIPS: github.com/google-deepmin… Paper updated with distilled models, and more: arxiv.org/abs/2410.16512 #ICLR2025

André Araujo@andrefaraujo

Excited to release a super capable family of image-text models from our TIPS #ICLR2025 paper! github.com/google-deepmin… We have models from ViT-S to -g, with spatial awareness, suitable to many multimodal AI applications. Can’t wait to see what the community will build with them!

2.2K

Arjun Karpur retweeted

André Araujo@andrefaraujo·11 Mar

André Araujo@andrefaraujo

Want some TIPS? Well, then check out “Text-Image Pretraining with Spatial awareness” :) TIPS is a general-purpose image-text encoder, for off-the-shelf dense and image-level prediction. Finally image-text pretraining with spatially-aware representations! arxiv.org/abs/2410.16512

3.6K

Arjun Karpur retweeted

André Araujo@andrefaraujo·23 Oct

Great work from our team at @GoogleDeepMind! With @kmaninis, @kfrancischen, @sohamg121, @arjunkarpur, Koert Chen, Ye Xia, Bingyi Cao, @GuangxingHan, Jan Dlabal, Dan Gnanapragasam, Mojtaba Seyedhosseini, @howardzzh

550

Arjun Karpur@arjunkarpur·21 Jun

OmniGlue on display at #CVPR24 Thanks for presenting our work! @hanwenjiang1 @andrefaraujo Full paper: hwjiang1510.github.io/OmniGlue/

332

Arjun Karpur@arjunkarpur·26 May

@AlphaRealcat I really love the image matching webui, glad OmniGlue is now an option there. Thank you!

Arjun Karpur retweeted

Vincent Qin@AlphaRealcat·24 May

Just released an ONNX version of OmniGlue. No more need for TensorFlow installations, folks! 😄 Check out the comparison between the TensorFlow and ONNX versions in the image below. @arjunkarpur Code: github.com/Vincentqyw/omn…

Dmytro Mishkin 🇺🇦@ducha_aiki

OmniGlue: Generalizable Feature Matching with Foundation Model Guidance @hanwenjiang1 , Arjun Karpur, Bingyi Cao, @qixing_huang @andrefaraujo tl;dr: use DINOv2 similarity to prune matching graphs, query and key use posenc, but attention not. No IMC eval arxiv.org/abs/2405.12979

2.1K

Arjun Karpur retweeted

André Araujo@andrefaraujo·23 May

Meet #CVPR2024 OmniGlue, the first learnable matcher designed with generalization as a core principle! Great performance on many domains, ideal for in-the-wild matching 🎯 Code available! arxiv.org/abs/2405.12979 with @hanwenjiang1, @arjunkarpur, Bingyi Cao, @qixing_huang

129

9.5K

Arjun Karpur retweeted

#CVPR2026@CVPR·23 May

HUGE shoutout to our #CVPR2024 Outstanding Reviewers 🫡

220

79.3K

Arjun Karpur@arjunkarpur·22 May

@AlphaRealcat Thanks for the tweet! Adding OmniGlue to the image matching webui is on our TODO list

Vincent Qin@AlphaRealcat·22 May

OmniGlue: Generalizable Feature Matching with Foundation Model Guidance github.com/google-researc…

653

Arjun Karpur@arjunkarpur·22 May

@qixing_huang @hanwenjiang1 Thanks for the shoutout Qixing! I’m glad we had a chance to work together again

Qixing Huang@qixing_huang·22 May

Hanwen's @hanwenjiang1 CVPR 24 paper is out. It is interesting that I supervised Arjun (second author) undergraduate thesis at UT, who joined Google right after having a BS degree. He has been doing some interesting 3D vision research, including a 3DV'24 oral paper (LFM-3D).

Zhenjun Zhao@zhenjun_zhao

OmniGlue: Generalizable Feature Matching with Foundation Model Guidance @hanwenjiang1, Arjun Karpur, Bingyi Cao, @qixing_huang, @andrefaraujo tl;dr: DINOv2->SG/LoFTR; DINOv2->similarities between keypoints->intra&inter-image graphs->self&cross-attention arxiv.org/pdf/2405.12979

Arjun Karpur retweeted

Kevis-Kokitsi Maninis@kmaninis·16 Oct

Are you evaluating 3D reconstruction/dense correspondences on synthetic datasets because real datasets are "not accurate enough"? Check out NAVI, a dataset that offers near-perfect alignments of 3D shapes on real image collections: navidataset.github.io #NeurIPS2023 (1/2)

Vittorio Ferrari@VittoFerrariCV

Three papers accepted to #NeurIPS 3/3 NAVI: a dataset of image collections of objects, along with high-quality 3D object scans, near-perfect 2D-3D alignments, and accurate camera parameters. arxiv.org/abs/2306.09109 navidataset.github.io With @jampani_varun, @kmaninis, others

11K

Arjun Karpur retweeted

Zhenjun Zhao@zhenjun_zhao·23 Mar

LFM-3D: Learnable Feature Matching Across Wide Baselines Using 3D Signals Arjun Karpur, Guilherme Perrotta, Ricardo Martin-Brualla, Howard Zhou, Andre Araujo tl;dr: normalized object coordinates/monocular depth estimates+improved PE->improve SuperGlue arxiv.org/pdf/2303.12779…

2.2K

Arjun Karpur retweeted

Dmytro Mishkin 🇺🇦@ducha_aiki·15 Aug

Global Features are All You Need for Image Retrieval and Reranking Shihao Shao, Kaifeng Chen, Arjun Karpur, Qinghua Cui, Andre Araujo, Bingyi Cao tl;dr: Regional+Scale GeM pool, search for higher p in GeM after training, ReLU+ before GeM, aQE-like rerank arxiv.org/abs/2308.06954…

7.1K
