Diane Larlus:

Our #ECCV2020 paper is out: arxiv.org/abs/2008.01392. By learning to predict a masked token in a caption using its associated image, our model learns generic visual representations from scratch using only image-caption pairs. w/ @mbsariyildiz @perezjln @naverlabseurope

Nrupatunga (@nrupatunga1987) · Aug 6

@dlarlus @mbsariyildiz @perezjln @naverlabseurope Just instantly curious. I will read the paper, but can I say that for any new task, if I only have a caption as the GT label for an image, I could train without having to mark bounding boxes, if my intention is just to get an approximate location?

Farhad Nooralahzadeh, PhD (@farhadnz) · Aug 6

@dlarlus @mbsariyildiz @perezjln @naverlabseurope How do you position your paper among other similar papers like VisualBERT, Pixel-BERT, VirTex, ...?
