Alexis Marouani

8 posts

@Alexis1097657

PhD Student in DINO Team | Meta FAIR

Paris · Joined July 2023
12 Following · 18 Followers
Alexis Marouani@Alexis1097657·
6/ Why does this matter? Standard ViTs often trade off classification for segmentation quality. By specializing the layers, we get the best of both worlds. It makes ViTs much more effective for "dense" tasks like object detection and depth estimation.
Alexis Marouani@Alexis1097657·
#ICLR2026 Frictions in Vision Transformers 1/ ViTs use a [CLS] token for global understanding and patch tokens for local details. Despite their different roles, we've been processing them with the exact same math. Looking forward to the discussions! Sat 25, 10:30 AM – 1 PM, P4 - #3303