Sam

@shar_dev7

Developer | Open Source Contributor | ML Engineer. Building, learning, and shipping ideas 🚀

New Delhi · Joined March 2026

21 Following · 3 Followers

Sam @shar_dev7 · Apr 18

Want the full work? Paper: arxiv.org/abs/2604.11177 Code: github.com/video-db/gemin…

Sam @shar_dev7 · Apr 18

Our paper "Do Thought Streams Matter?" got featured in a YouTube video 👀 Quick watch, simple breakdown, and some cool insights on Gemini video reasoning. Watch here: youtube.com/watch?v=8HN19n…

Sam @shar_dev7 · Apr 17

@alexabelonix Thanks! 🙌 Glad it resonated.

Alexa Web3 (e/acc) @alexabelonix · Apr 17

@shar_dev7 love this approach.

Sam @shar_dev7 · Apr 17

Can video AI really reason about what it sees, or does it just sound confident? We explored this in our new paper on Gemini vision-language models for video scene understanding. Some interesting results came out of it 👀

Sam @shar_dev7 · Apr 17

If you work on VLMs, video understanding, or multimodal reasoning, this may interest you. Paper: arxiv.org/abs/2604.11177 GitHub repo: github.com/video-db/gemin…

Sam @shar_dev7 · Apr 17

This matters because a lot of people assume more reasoning = better answers. But for video models, the story may be more nuanced. Our paper looks at this closely and helps make sense of how reasoning works in real video understanding settings.

Sam @shar_dev7 · Apr 14

@ashu_trv Interesting perspective on reasoning efficiency in VLMs.

Ashu @ashu_trv · Apr 14

We just released a new benchmark looking inside the "black box" of Gemini 2.5 reasoning for video understanding. Does "thinking more" always lead to better results? The answer is more nuanced than you’d think 💭