Dídac Surís

213 posts

@Surisdi

Research Scientist @AIatMeta. Previously a Computer Vision PhD student at @Columbia. Amateur guitarist. Tweets in Catalan, Spanish or English

Joined August 2010

514 Following · 408 Followers

Dídac Surís retweeted

Kate Saenko @kate_saenko_ · Apr 11

Excited to share SA-FARI which will be presented as an oral at CVPR 26! conservationxlabs.com/sa-fari My team at Meta collaborated with ConservationX Labs to create the largest open video dataset for wildlife detection -- with @Surisdi @wrong_whp @YuanTingHu1

Dídac Surís retweeted

Nicolas Carion @alcinos26 · Nov 20

Happy and proud to release SAM3, our new segmentation model. What's new? It's now a fully fledged open vocabulary detector, capable of finding any object given a simple text prompt or an example. And we cooked hard to bring you SAM's signature "it just works" feel. A 🧵 1/x

Dídac Surís retweeted

ege ozguroglu @EgeOzguroglu · Jun 19

At #CVPR2024: we present pix2gestalt, which synthesizes whole objects from occluded ones, enabling zero-shot amodal segmentation, recognition, and 3D reconstruction! Project Page: gestalt.cs.columbia.edu Code: github.com/cvlab-columbia… arXiv: arxiv.org/abs/2401.14398

Peyman Milanfar @docmilanfar

the hardest problem in computer vision? occlusion - it's always occlusion

Dídac Surís retweeted

Mia Chiquier @mia_chiquier · Apr 29

Multimodal pre-trained models, such as CLIP, are popular for zero-shot classification due to their open-vocabulary flexibility and high performance, but how would you classify images that don’t have obvious names using CLIP?

Dídac Surís retweeted

Ruoshi Liu @ruoshi_liu · Jan 26

Thanks @_akhaliq for tweeting our work! It has been shown in our prior work Zero123 that Stable Diffusion has learned powerful visual priors that can serve as the foundation of zero-shot generalization ability for many vision tasks. In our recent work pix2gestalt, we show that Stable Diffusion, when finetuned on the task of amodal segmentation, performs incredibly well on data far outside the training distribution. We've demonstrated that this model can serve as a unified solution for occlusion reasoning, benefiting many other tasks whose performance is greatly hindered by occlusion in images, such as recognition, novel view synthesis, 3D reconstruction, etc. Work led by Ege Ozguroglu, who is currently applying for a PhD!

AK @_akhaliq

pix2gestalt: Amodal Segmentation by Synthesizing Wholes paper page: huggingface.co/papers/2401.14… synthesizes whole objects from only partially visible ones, enabling amodal segmentation, recognition, and 3D reconstruction of occluded objects

Dídac Surís retweeted

Sachit Menon @SachitMenon · Oct 5

Come talk to me and @Surisdi about ViperGPT at our poster today at 2:30 (Foyer Sud) or our talk at 4:30 (Paris Sud) in person at #ICCV2023!

AK @_akhaliq

ViperGPT: Visual Inference via Python Execution for Reasoning abs: arxiv.org/abs/2303.08128 project page: viper.cs.columbia.edu

Dídac Surís @Surisdi · Oct 5

This afternoon we will present our ViperGPT🐍 paper at #ICCV2023. If you’re in Paris, come talk to us! Oral presentation in room "Paris Sud" and poster 170 in room "Foyer Sud" Project page: viper.cs.columbia.edu (work with @SachitMenon and @cvondrick)

Dídac Surís retweeted

Joan Serrà @serrjoa · Jul 29

Join us for an internship this winter to work with @santty128 and me in #Barcelona: jobs.dolby.com/careers/job/17…

Dídac Surís retweeted

Purva Tendulkar @PurvaTendulkar · Mar 4

Excited to share our CVPR'23 paper -- FLEX🤸 We synthesize full-body 3D avatars grasping everyday objects in household scenes without requiring full-body grasping data. Website: flex.cs.columbia.edu Paper: arxiv.org/abs/2211.11903 Code: github.com/purvaten/FLEX 🧵(1/4)

Dídac Surís retweeted

Justin Salamon @justin_salamon · Jun 22

Excited to finally share this project! We train a model to match music to video based on its contents and style 🎞️➡️🎵 Here are examples of matching music to video shot on mobile phones 📱 Led by @Surisdi w/ @cvondrick & Bryan Russell #CVPR2022 Let's see more results 🧵(1/n)

Dídac Surís @Surisdi

Do you have some home videos you’d like to add music to? Tomorrow at #CVPR2022 we present “It’s Time for Artistic Correspondence in Music and Video”! video: youtu.be/A4g30USxI0Q website and paper: musicforvideo.cs.columbia.edu w/ @cvondrick, Bryan Russell, @justin_salamon

Dídac Surís @Surisdi · Jun 21

YouTube

Dídac Surís @Surisdi · Jun 2

@dancasas @ctocevents @CVPR @Michael_J_Black Thanks for the answer @dancasas! Yes I saw it, I think they added the option today.

Dan Casas @dancasas · Jun 2

@Surisdi @ctocevents @CVPR @Michael_J_Black The website managing the uploads (Conference Harvester) allows you to upload a separate closed caption file (.srt in a zip file). I just did it.

Michael Black @Michael_J_Black · May 30

One more @CVPR question: We are asked to include captions in our videos. Is this also true for oral presentations? Oral videos are not supposed to include the speaker thumbnail so I wondered if they are also not supposed to be captioned.

Dídac Surís @Surisdi · Jun 1

@ctocevents @CVPR @Michael_J_Black Is this also expected for oral presentations? Is there a way of adding captions as a separate file, so that they can be turned on/off for virtual/oral? (e.g. *.vtt files). Thanks!

Nicole Finn @ctocevents · May 30

@CVPR @Michael_J_Black Any video submitted on the virtual site needs to have captions.

Dídac Surís retweeted

ColumbiaCompSci @ColumbiaCompSci · Mar 8

Didac Suris (@Surisdi), one of our PhD students, won a Microsoft Research Fellowship (@MSFTResearch)! Learn more about him and his PhD experience here - bit.ly/PhDDidacS

Dídac Surís retweeted

Pascal Mettes @PascalMettes · Oct 17

Want to know what the future of video research will be? Join us at the #ICCV2021 workshop on Structured Representations for Video Understanding. We end with a bang: a panel with Josef Sivic and Deva Ramanan. A must watch! We start at 15:00 CEST (9:00 local), panel at 21:30 CEST

Dídac Surís retweeted

Hazel Doughty @doughty_hazel · Aug 23

Working on how to best represent video? Still time to submit to the #ICCV2021 workshop on Structured Representations for Video Understanding. We're inviting submissions of either recently published or unpublished works. Deadline: Aug 27th Details: sites.google.com/view/srvu-iccv…

Pascal Mettes @PascalMettes

It’s time to discuss: what is the best structure for representing videos and what is the way forward in video understanding? We are eager to hear your views at our #ICCV2021 workshop on Structured Representations for Video Understanding Submission: Aug 27 sites.google.com/view/srvu-iccv…

Dídac Surís retweeted

Pascal Mettes @PascalMettes · Jul 20

Dídac Surís retweeted

Pascal Mettes @PascalMettes · Jul 20

We will have a full day with keynotes and accepted oral/poster presentations. We accept submissions for unpublished work and work published at a recent conference/journal (incl. ICCV’21) Organized with: @cvondrick @Surisdi @doughty_hazel @MikeShou1 Shih-Fu Chang @CordeliaSchmid

Dídac Surís @Surisdi · Mar 6

@mayfer @cvondrick Thanks for your interest! No, we don't manually tag abstract predictions as closer to the center. This is learned by the model and it is a natural result of using hyperbolic geometry.

murat 🍥 @mayfer · Mar 5

@Surisdi @cvondrick thanks for the presentation & can't wait to see more about your work. i do have one question. in your video prediction use case, was the hierarchical classification supervised? (manually tagging more abstract predictions as closer to center?)

murat 🍥 @mayfer · Mar 5

seems like neural network classifiers will do so much better in hyperbolic instead of linear space. a linear average ends up somewhere between two points, boring. a hyperbolic average ends up closer to the center - more generalized, more abstract youtube.com/watch?v=-Uy92j…

YouTube

Dídac Surís retweeted

Carl Vondrick @cvondrick · Mar 2

The future is hard to anticipate! In our latest #CVPR2021 paper, we introduce a framework for learning *what* is predictable in the future. Rather than committing up front to categories to predict, our approach learns how to hedge the bet. hyperfuture.cs.columbia.edu
