Daan de Geus
@dcdegeus

68 posts

Assistant professor @TUEindhoven | Prev. visiting @RWTHVisionLab | Computer vision

Joined June 2016
400 Following · 188 Followers
Daan de Geus reposted
Niccolò Cavagnero @neikos00
🚀 New CVPR Workshop paper: Plain Mask Transformer (PMT)
Finetuning VFMs for segmentation breaks their key advantage: using a shared, frozen encoder for multiple tasks.
PMT: a fast Mask Transformer for frozen VFM features.
📄 arxiv.org/abs/2603.25398
💻 github.com/tue-mps/pmt
Daan de Geus reposted
Tommie Kerssies @tommiekerssies
Video segmentation methods have become increasingly complex. In our #CVPR2026 paper we show once again that a surprisingly simple encoder-only architecture is actually sufficient, boosting speed up to 10x! 🚀
📄 Paper: arxiv.org/abs/2602.17807
💻 Code: github.com/tue-mps/videomt
(1/5)
Daan de Geus reposted
Niels Rogge @NielsRogge
Opus 4.6 made a @Gradio demo for it too! It uses a "chunked window" approach, allowing it to run at up to 160 frames per second (FPS). I'm really impressed by coding agents: porting a model + demo in less than 2 days.
Quoting Niels Rogge @NielsRogge:
I tried Codex 5.3 (web) for porting VidEoMT, a simple and elegant ViT-based video segmentation model, to @huggingface Transformers. Sadly, it missed the global picture, mistakenly assuming the model uses DINOv3 as its backbone, whereas it actually uses DINOv2. It got stuck. Opus 4.6 fixed it after I told it. The job of ML Engineer is still safe; humans stay in the driver's seat. PR: github.com/huggingface/tr…

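A rough sketch of the "chunked window" style of video inference mentioned above: frames are grouped into fixed-size windows so one forward pass covers several frames at once. All names here (`chunked_inference`, `run_model`) and the chunking scheme itself are illustrative assumptions, not the actual demo code.

```python
import numpy as np

def chunked_inference(frames, chunk_size, run_model):
    """frames: (T, H, W, C) video; run_model maps one chunk to per-frame outputs."""
    outputs = []
    for start in range(0, len(frames), chunk_size):
        chunk = frames[start:start + chunk_size]  # last window may be shorter
        outputs.extend(run_model(chunk))          # one forward pass per window
    return outputs

# Toy usage: a stand-in "model" that returns one scalar per frame.
frames = np.zeros((7, 4, 4, 3))
out = chunked_inference(frames, chunk_size=3,
                        run_model=lambda c: [float(f.mean()) for f in c])
```

Amortizing the per-call overhead across a window, rather than invoking the model frame by frame, is what makes high FPS plausible here.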
Daan de Geus reposted
Niccolò Cavagnero @neikos00
🔥 EoMT now supports DINOv3! 🔥
✨ Upgrading DINOv2 to DINOv3 consistently improves image segmentation performance
⚡ Lightning-fast inference with EoMT's minimalist design
📦 Model weights and code available!
🔗 github.com/tue-mps/eomt
Daan de Geus reposted
Lucas Beyer (bl16) @giffmana
I like the Encoder-only Mask Transformer (EoMT): basically removing all the bells and whistles, and doing panoptic segmentation with an almost vanilla ViT. You're sliiiiightly worse for the same encoder size, but it's a lot simpler/faster and (likely) more scalable. I wish they had added peak GPU memory to that table, though.
Quoting Niels Rogge @NielsRogge:
New model alert in Transformers: EoMT! EoMT greatly simplifies the design of ViTs for image segmentation 🙌 Unlike Mask2Former and OneFormer, which add complex modules like an adapter, pixel decoder, and Transformer decoder on top, EoMT is just a ViT with a set of query tokens ✅

Daan de Geus reposted
Niels Rogge @NielsRogge
New model alert in Transformers: EoMT! EoMT greatly simplifies the design of ViTs for image segmentation 🙌 Unlike Mask2Former and OneFormer, which add complex modules like an adapter, pixel decoder, and Transformer decoder on top, EoMT is just a ViT with a set of query tokens ✅
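The "just a ViT with a set of query tokens" recipe can be sketched roughly as follows. This is an illustrative NumPy sketch of the general encoder-only idea, not the actual EoMT implementation; every function and variable name here is an assumption. Learned query vectors are concatenated with the patch tokens, run through the shared encoder, and each query then predicts a class plus a mask via dot products with the patch tokens.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder_only_mask_head(patch_tokens, queries, w_cls, w_mask, encoder):
    """patch_tokens: (N, D) ViT tokens; queries: (Q, D) learned query vectors."""
    # Queries attend jointly with patches inside the (otherwise vanilla) encoder.
    x = encoder(np.concatenate([patch_tokens, queries], axis=0))
    patches, q = x[: len(patch_tokens)], x[len(patch_tokens):]
    class_logits = q @ w_cls                     # (Q, C+1), incl. a "no object" class
    mask_logits = (q @ w_mask) @ patches.T       # (Q, N): per-query mask over patches
    return class_logits, mask_logits

# Toy usage with an identity "encoder", just to show the shapes:
N, Q, D, C = 10, 5, 16, 3
cls, masks = encoder_only_mask_head(rng.normal(size=(N, D)),
                                    rng.normal(size=(Q, D)),
                                    rng.normal(size=(D, C + 1)),
                                    rng.normal(size=(D, D)),
                                    lambda t: t)
```

The point of the design is that no adapter, pixel decoder, or extra Transformer decoder is needed: the encoder itself does the query-to-patch interaction.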
Daan de Geus reposted
Tommie Kerssies @tommiekerssies
🚨 CVPR Highlight Alert! 🚨 We’re presenting our Encoder-only Mask Transformer (EoMT) tomorrow at #CVPR2025, 10:30–12:30, Poster #407! 🎸
👉 github.com/tue-mps/eomt
➕ Bonus: we're releasing the biggest EoMT yet… (1/2)
Quoting Tommie Kerssies @tommiekerssies:
Image segmentation doesn’t have to be rocket science. 🚀 Why build a rocket engine full of bolted-on subsystems when one elegant unit does the job? 💡 That’s what we did for segmentation. ✅ Meet the Encoder-only Mask Transformer (EoMT): tue-mps.github.io/eomt (CVPR 2025) (1/6)

Daan de Geus reposted
Kadir Yilmaz @KadirYilmaz_CV
I'll be presenting "DINO in the Room (DITR)", the winning method of the ScanNet++ 3D semantic segmentation challenge, tomorrow at CVPR at 10 a.m. in Room 211. Project page: visualcomputinginstitute.github.io/DITR/
Daan de Geus reposted
Tommie Kerssies @tommiekerssies
Image segmentation doesn’t have to be rocket science. 🚀 Why build a rocket engine full of bolted-on subsystems when one elegant unit does the job? 💡 That’s what we did for segmentation. ✅ Meet the Encoder-only Mask Transformer (EoMT): tue-mps.github.io/eomt (CVPR 2025) (1/6)
Daan de Geus reposted
Tuan-Hung VU @tuan_hung_vu
The BRAVO Challenge 2024 attracted nearly 100 submissions from international teams representing notable research institutions. The results reveal valuable insights into developing reliable semantic segmentation models. #ECCV2024 #UNCVWorkshop arxiv.org/abs/2409.15107
Daan de Geus reposted
AK @_akhaliq
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
discuss: huggingface.co/papers/2409.11…
Recent work showed that large diffusion models can be reused as highly precise monocular depth estimators by casting depth estimation as an image-conditional image generation task. While the proposed model achieved state-of-the-art results, high computational demands due to multi-step inference limited its use in many scenarios. In this paper, we show that the perceived inefficiency was caused by a flaw in the inference pipeline that has so far gone unnoticed. The fixed model performs comparably to the best previously reported configuration while being more than 200× faster. To optimize for downstream task performance, we perform end-to-end fine-tuning on top of the single-step model with task-specific losses and get a deterministic model that outperforms all other diffusion-based depth and normal estimation models on common zero-shot benchmarks. We surprisingly find that this fine-tuning protocol also works directly on Stable Diffusion and achieves comparable performance to current state-of-the-art diffusion-based depth and normal estimation models, calling into question some of the conclusions drawn from prior works.
Daan de Geus reposted
Karim Knaebel @karimknaebel
Check out our work on fine-tuning image-conditional diffusion models for depth and normal estimation. Widely used diffusion models can be improved with single-step inference and task-specific fine-tuning, allowing us to gain better accuracy while being 200x faster! ⚡ 🧵(1/6)
Daan de Geus @dcdegeus
TAPPS:
➡️ Uses a Mask Transformer
➡️ Uses shared queries that jointly represent objects and their parts
➡️ Segments objects and parts jointly
➡️ Is directly optimized for the PPS task, instead of surrogate subtasks like existing approaches
✅ Achieves SOTA accuracy [2/3]
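The shared-query idea can be sketched roughly as follows; this is an illustrative NumPy sketch under assumed shapes, not the TAPPS code, and all names here are hypothetical. Each query produces one object mask plus a fixed number of part masks over the same patch tokens, so objects and their parts are segmented jointly by construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def shared_query_masks(queries, patches, w_obj, w_part, num_parts):
    """queries: (Q, D); patches: (N, D). One query yields an object mask and its part masks."""
    obj_masks = (queries @ w_obj) @ patches.T                  # (Q, N) object-level masks
    # The same query is projected to one embedding per part, then dotted with patches.
    part_emb = (queries @ w_part).reshape(len(queries), num_parts, -1)
    part_masks = np.einsum("qpd,nd->qpn", part_emb, patches)   # (Q, P, N) part-level masks
    return obj_masks, part_masks

# Toy shapes: 5 queries, 12 patch tokens, feature dim 8, 3 parts per object.
Q, N, D, P = 5, 12, 8, 3
obj, parts = shared_query_masks(rng.normal(size=(Q, D)), rng.normal(size=(N, D)),
                                rng.normal(size=(D, D)), rng.normal(size=(D, P * D)), P)
```

Because the object mask and its part masks come from one shared query, consistency between the two levels falls out of the architecture rather than requiring separate subtask heads to be reconciled afterwards.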
Daan de Geus @dcdegeus
Want to identify and segment all foreground objects, background regions AND object parts in an image, in a consistent manner? Look no further! Tomorrow at #CVPR2024 we’ll present TAPPS, a unified network for the part-aware panoptic segmentation (PPS) task. [1/3]