Pinned Tweet
Prithiv Sakthi
7.4K posts

Prithiv Sakthi
@prithivMLmods
Computer Vision • Multimodal AI • @huggingface Fellow ML🤗 • Computational Intelligence • Diffusion-Driven Adapters • https://t.co/CZfzd6KVRA
India · Joined October 2022
765 Following · 524 Followers

Map-Anything v1 (Universal Feed-Forward Metric 3D Reconstruction) demo is now available on Hugging Face Spaces. Built with @Gradio and integrated with @rerundotio , it performs multi-image and video-based 3D reconstruction, depth, normal map, and interactive measurements.
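The "interactive measurements" the demo offers follow directly from metric depth: once depth is in real-world units, pixels can be unprojected into a 3D point cloud and distances measured between them. A minimal sketch of that unprojection, assuming a simple pinhole camera model (the intrinsics and array shapes here are illustrative, not taken from Map-Anything):

```python
import numpy as np

def unproject_depth(depth, fx, fy, cx, cy):
    """Unproject a metric depth map (H, W) into an (H, W, 3) point cloud
    with a pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

# Toy example: a flat wall 2 m away, tiny 4x4 depth map, 100 px focal length
depth = np.full((4, 4), 2.0)
pts = unproject_depth(depth, fx=100, fy=100, cx=2.0, cy=2.0)

# Because depth is metric, distances between reconstructed points are too
d = np.linalg.norm(pts[0, 0] - pts[0, 3])  # → 0.06 (meters)
```

In the real demo the depth map comes from the model and the intrinsics are either estimated or supplied, but the measurement step reduces to exactly this kind of unprojection.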
Prithiv Sakthi retweeted

@mervenoyann They also have a demo for that.
huggingface.co/spaces/allenai…

@mervenoyann I’m testing a general-use SOTA for video point tracking.

AI2 released a new family of vision LMs for pointing (SOTA!) 🔥
> MolmoPoint-8B (general use)
> MolmoPoint-GUI-8B (graphical computer use)
> MolmoPoint-Vid-4B (counting/tracking in videos)
also with their datasets 🥵
Ai2 @allen_ai
Grounding lets vision-language models do more than describe—they can point to where a robot should grasp, which button to click, or which object to track across video frames. Today we're releasing MolmoPoint, a better way for models to point. 🧵
Prithiv Sakthi retweeted

VLMs already have visual tokens. Letting them point by selecting those tokens turns out to be simpler, faster, & better.
🤖 Models: huggingface.co/collections/al…
📦 Data: huggingface.co/collections/al…
💻 Code: github.com/allenai/molmo2
📖 Blog: allenai.org/blog/molmopoint
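The "pointing by selecting visual tokens" idea can be illustrated with a small sketch: instead of decoding coordinates as text, the model scores its own visual tokens and the chosen token's patch position becomes the point. The grid size, image size, and random scores below are illustrative stand-ins, not MolmoPoint's actual configuration:

```python
import numpy as np

# Assumed toy setup: a 24x24 grid of visual tokens over a 336x336 image,
# so each token covers a 14x14 pixel patch.
GRID, IMAGE = 24, 336
PATCH = IMAGE // GRID

def token_to_point(token_idx):
    """Map a flat visual-token index back to the pixel center of its patch."""
    row, col = divmod(token_idx, GRID)
    return ((col + 0.5) * PATCH, (row + 0.5) * PATCH)  # (x, y) in pixels

# Stand-in for the model's per-token pointing scores (real ones come from
# a head over the existing visual tokens; random here for illustration).
rng = np.random.default_rng(0)
scores = rng.standard_normal(GRID * GRID)

# "Pointing" is then just an argmax over tokens plus a coordinate lookup
x, y = token_to_point(int(np.argmax(scores)))
```

Selecting among tokens the model already computes avoids generating coordinate strings autoregressively, which is one plausible reading of the "simpler, faster, & better" claim above.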
Prithiv Sakthi retweeted

new 1T+ parameter model from @XiaomiMiMo, supporting 1M context length thanks to 7:1 hybrid sliding window attention!!
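A 7:1 hybrid means seven sliding-window attention layers for every one full-attention layer, which keeps most layers cheap at long context while periodically mixing global information. A minimal sketch of the layer schedule and the two mask types (the exact schedule, window size, and mask details of the Xiaomi model are assumptions here):

```python
import numpy as np

def attention_mask(seq_len, window=None):
    """Boolean causal attention mask; if `window` is set, each query may
    also only attend to the most recent `window` keys (sliding window)."""
    i = np.arange(seq_len)[:, None]  # query positions
    j = np.arange(seq_len)[None, :]  # key positions
    mask = j <= i                    # causal: no attending to the future
    if window is not None:
        mask &= j > i - window       # restrict to the local window
    return mask

def layer_schedule(n_layers, ratio=7):
    """7:1 hybrid: `ratio` sliding-window layers per full-attention layer."""
    return ["full" if (l + 1) % (ratio + 1) == 0 else "sliding"
            for l in range(n_layers)]

sched = layer_schedule(16)  # sliding x7, full, sliding x7, full
```

With a fixed window, the sliding layers cost O(n·w) instead of O(n²) in sequence length, which is what makes 1M-token contexts tractable.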

Prithiv Sakthi retweeted

Introducing the Paper Pages skill!
Simply paste this SKILL.md, so your coding agent knows how to work with @huggingface papers
Ask it to summarize papers, search papers, or list linked models or datasets