
Arijit Ray
79 posts

Arijit Ray
@ARRay693
AI PhD Student, w/ Profs @kate_saenko_, @RanjayKrishna | Teaching machines to help humans in the digital and physical world | Prev @Google, @AIatMeta







Today we’re excited to unveil a new generation of Segment Anything Models: 1️⃣ SAM 3 enables detecting, segmenting and tracking of objects across images and videos, now with short text phrases and exemplar prompts. 🔗 Learn more about SAM 3: go.meta.me/591040 2️⃣ SAM 3D brings the model collection into the 3rd dimension to enable precise reconstruction of 3D objects and people from a single 2D image. 🔗 Learn more about SAM 3D: go.meta.me/305985 These models offer innovative capabilities and unique tools for developers and researchers to create, experiment and uplevel media workflows.


🌶️ hot take 🌶️ > we should normalize training on the test set yes, you read that right. no, I'm not joking. and, yes... I have taken ML 101 👉 here's why this is crucial for future multimodal LLM research [1/n] 🧵

MLLMs are great at understanding videos, but struggle with spatial reasoning—like estimating distances or tracking objects across time. the bottleneck? getting precise 3D spatial annotations on real videos is expensive and error-prone. introducing SIMS-V 🤖 [1/n]

Introducing Cambrian-S it’s a position, a dataset, a benchmark, and a model but above all, it represents our first steps toward exploring spatial supersensing in video. 🧶



