Daan de Geus
@dcdegeus

68 posts

Assistant professor @TUEindhoven | Prev. visiting @RWTHVisionLab | Computer vision

Joined June 2016
400 Following · 188 Followers
Daan de Geus reposted
Niccolò Cavagnero @neikos00
🚀 New CVPR Workshop paper: Plain Mask Transformer (PMT)
Finetuning VFMs for segmentation breaks their key advantage: using a shared, frozen encoder for multiple tasks.
PMT: a fast Mask Transformer for frozen VFM features.
📄 arxiv.org/abs/2603.25398
💻 github.com/tue-mps/pmt
Daan de Geus reposted
Tommie Kerssies @tommiekerssies
Video segmentation methods have become increasingly complex. In our #CVPR2026 paper we show once again that a surprisingly simple encoder-only architecture is actually sufficient, boosting speed up to 10x! 🚀
📄 Paper: arxiv.org/abs/2602.17807
💻 Code: github.com/tue-mps/videomt
(1/5)
Daan de Geus reposted
Niels Rogge @NielsRogge
Opus 4.6 made a @Gradio demo for it too! It uses a "chunked window" approach, allowing it to run at up to 160 frames per second (FPS). I'm really impressed by coding agents: porting a model + demo in less than 2 days.
Quoting Niels Rogge @NielsRogge:
I tried Codex 5.3 (web) for porting VidEoMT, a simple and elegant ViT-based video segmentation model, to @huggingface Transformers. Sadly, it missed the global picture, mistakenly assuming the model uses DINOv3 as its backbone, whereas it actually uses DINOv2. It got stuck. Opus 4.6 fixed it after I told it. The job of ML Engineer is still safe; humans stay in the driver's seat. PR: github.com/huggingface/tr…

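A rough sketch of the "chunked window" style of video inference mentioned above: frames are grouped into fixed-size windows so one forward pass covers several frames at once. All names here (`chunked_inference`, `run_model`) and the chunking scheme itself are illustrative assumptions, not the actual demo code.

```python
import numpy as np

def chunked_inference(frames, chunk_size, run_model):
    """frames: (T, H, W, C) video; run_model maps one chunk to per-frame outputs."""
    outputs = []
    for start in range(0, len(frames), chunk_size):
        chunk = frames[start:start + chunk_size]  # last window may be shorter
        outputs.extend(run_model(chunk))          # one forward pass per window
    return outputs

# Toy usage: a stand-in "model" that returns one scalar per frame.
frames = np.zeros((7, 4, 4, 3))
out = chunked_inference(frames, chunk_size=3,
                        run_model=lambda c: [float(f.mean()) for f in c])
```

Amortizing the per-call overhead across a window, rather than invoking the model frame by frame, is what makes high FPS plausible here.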
Daan de Geus reposted
Niccolò Cavagnero @neikos00
🔥 EoMT now supports DINOv3! 🔥
✨ Upgrading DINOv2 to DINOv3 consistently improves image segmentation performance
⚡ Lightning-fast inference with EoMT's minimalist design
📦 Model weights and code available!
🔗 github.com/tue-mps/eomt
Daan de Geus reposted
Lucas Beyer (bl16) @giffmana
I like the Encoder-only Mask Transformer (EoMT): basically removing all the bells and whistles, and doing panoptic segmentation with an almost vanilla ViT. You're sliiiiightly worse for the same encoder size, but it's a lot simpler/faster and (likely) more scalable. I wish they had added peak GPU memory to that table, though.
Quoting Niels Rogge @NielsRogge:
New model alert in Transformers: EoMT! EoMT greatly simplifies the design of ViTs for image segmentation 🙌 Unlike Mask2Former and OneFormer, which add complex modules like an adapter, pixel decoder, and Transformer decoder on top, EoMT is just a ViT with a set of query tokens ✅

Daan de Geus reposted
Niels Rogge @NielsRogge
New model alert in Transformers: EoMT! EoMT greatly simplifies the design of ViTs for image segmentation 🙌 Unlike Mask2Former and OneFormer, which add complex modules like an adapter, pixel decoder, and Transformer decoder on top, EoMT is just a ViT with a set of query tokens ✅
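The "just a ViT with a set of query tokens" recipe can be sketched roughly as follows. This is an illustrative NumPy sketch of the general encoder-only idea, not the actual EoMT implementation; every function and variable name here is an assumption. Learned query vectors are concatenated with the patch tokens, run through the shared encoder, and each query then predicts a class plus a mask via dot products with the patch tokens.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder_only_mask_head(patch_tokens, queries, w_cls, w_mask, encoder):
    """patch_tokens: (N, D) ViT tokens; queries: (Q, D) learned query vectors."""
    # Queries attend jointly with patches inside the (otherwise vanilla) encoder.
    x = encoder(np.concatenate([patch_tokens, queries], axis=0))
    patches, q = x[: len(patch_tokens)], x[len(patch_tokens):]
    class_logits = q @ w_cls                     # (Q, C+1), incl. a "no object" class
    mask_logits = (q @ w_mask) @ patches.T       # (Q, N): per-query mask over patches
    return class_logits, mask_logits

# Toy usage with an identity "encoder", just to show the shapes:
N, Q, D, C = 10, 5, 16, 3
cls, masks = encoder_only_mask_head(rng.normal(size=(N, D)),
                                    rng.normal(size=(Q, D)),
                                    rng.normal(size=(D, C + 1)),
                                    rng.normal(size=(D, D)),
                                    lambda t: t)
```

The point of the design is that no adapter, pixel decoder, or extra Transformer decoder is needed: the encoder itself does the query-to-patch interaction.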
Daan de Geus reposted
Tommie Kerssies @tommiekerssies
🚨 CVPR Highlight Alert! 🚨 We’re presenting our Encoder-only Mask Transformer (EoMT) tomorrow at #CVPR2025, 10:30–12:30, Poster #407! 🎸
👉 github.com/tue-mps/eomt
➕ Bonus: we're releasing the biggest EoMT yet… (1/2)
Quoting Tommie Kerssies @tommiekerssies:
Image segmentation doesn’t have to be rocket science. 🚀 Why build a rocket engine full of bolted-on subsystems when one elegant unit does the job? 💡 That’s what we did for segmentation. ✅ Meet the Encoder-only Mask Transformer (EoMT): tue-mps.github.io/eomt (CVPR 2025) (1/6)

Daan de Geus reposted
Kadir Yilmaz @KadirYilmaz_CV
I'll be presenting "DINO in the Room (DITR)", the winning method of the ScanNet++ 3D semantic segmentation challenge, tomorrow at CVPR at 10 a.m. in Room 211. Project page: visualcomputinginstitute.github.io/DITR/
Daan de Geus reposted
Tommie Kerssies @tommiekerssies
Image segmentation doesn’t have to be rocket science. 🚀 Why build a rocket engine full of bolted-on subsystems when one elegant unit does the job? 💡 That’s what we did for segmentation. ✅ Meet the Encoder-only Mask Transformer (EoMT): tue-mps.github.io/eomt (CVPR 2025) (1/6)
Daan de Geus reposted
Tuan-Hung VU @tuan_hung_vu
The BRAVO Challenge 2024 attracted nearly 100 submissions from international teams representing notable research institutions. The results reveal valuable insights into developing reliable semantic segmentation models. #ECCV2024 #UNCVWorkshop arxiv.org/abs/2409.15107
Daan de Geus reposted
AK @_akhaliq
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
discuss: huggingface.co/papers/2409.11…
Recent work showed that large diffusion models can be reused as highly precise monocular depth estimators by casting depth estimation as an image-conditional image generation task. While the proposed model achieved state-of-the-art results, high computational demands due to multi-step inference limited its use in many scenarios. In this paper, we show that the perceived inefficiency was caused by a flaw in the inference pipeline that has so far gone unnoticed. The fixed model performs comparably to the best previously reported configuration while being more than 200× faster. To optimize for downstream task performance, we perform end-to-end fine-tuning on top of the single-step model with task-specific losses and get a deterministic model that outperforms all other diffusion-based depth and normal estimation models on common zero-shot benchmarks. We surprisingly find that this fine-tuning protocol also works directly on Stable Diffusion and achieves comparable performance to current state-of-the-art diffusion-based depth and normal estimation models, calling into question some of the conclusions drawn from prior works.
Daan de Geus reposted
Karim Knaebel @karimknaebel
Check out our work on fine-tuning image-conditional diffusion models for depth and normal estimation. Widely used diffusion models can be improved with single-step inference and task-specific fine-tuning, allowing us to gain better accuracy while being 200x faster! ⚡ 🧵(1/6)
Daan de Geus @dcdegeus
TAPPS:
➡️ Uses a Mask Transformer
➡️ Uses shared queries that jointly represent objects and their parts
➡️ Segments objects and parts jointly
➡️ Is directly optimized for the PPS task, instead of surrogate subtasks like existing approaches
✅ Achieves SOTA accuracy [2/3]
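The shared-query idea can be sketched roughly as follows; this is an illustrative NumPy sketch under assumed shapes, not the TAPPS code, and all names here are hypothetical. Each query produces one object mask plus a fixed number of part masks over the same patch tokens, so objects and their parts are segmented jointly by construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def shared_query_masks(queries, patches, w_obj, w_part, num_parts):
    """queries: (Q, D); patches: (N, D). One query yields an object mask and its part masks."""
    obj_masks = (queries @ w_obj) @ patches.T                  # (Q, N) object-level masks
    # The same query is projected to one embedding per part, then dotted with patches.
    part_emb = (queries @ w_part).reshape(len(queries), num_parts, -1)
    part_masks = np.einsum("qpd,nd->qpn", part_emb, patches)   # (Q, P, N) part-level masks
    return obj_masks, part_masks

# Toy shapes: 5 queries, 12 patch tokens, feature dim 8, 3 parts per object.
Q, N, D, P = 5, 12, 8, 3
obj, parts = shared_query_masks(rng.normal(size=(Q, D)), rng.normal(size=(N, D)),
                                rng.normal(size=(D, D)), rng.normal(size=(D, P * D)), P)
```

Because the object mask and its part masks come from one shared query, consistency between the two levels falls out of the architecture rather than requiring separate subtask heads to be reconciled afterwards.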
Daan de Geus @dcdegeus
Want to identify and segment all foreground objects, background regions AND object parts in an image, in a consistent manner? Look no further! Tomorrow at #CVPR2024 we’ll present TAPPS, a unified network for the part-aware panoptic segmentation (PPS) task. [1/3]