
Sayeed Shafayet Chowdhury
@SayeedShafayet
Incoming Assistant Professor, Computer Science, Indiana University Indianapolis, Luddy School of Informatics

On social media, the famous "Chihuahua or Muffin" problem in computer vision is considered solved by GPT-4V. But really? The answer is NO. GPT-4V cannot reason well about the same images from the original "Chihuahua or Muffin" grid when they are arranged in a different layout. I experimented by rearranging the same images from the classic 4x4 grid into a different layout. First, GPT-4V does not recognize the content in detail and miscounts the number of images. Then, when asked about the third image in the top row, GPT-4V misrecognizes a Chihuahua as a muffin. So "Chihuahua or Muffin" has not been solved yet. But how can GPT-4V work so well on the original image? My guess is that since that image is everywhere, GPT-4V was very likely trained on it and memorized its labels.

Today's TODO List: 1. Wait for ICCV Rejection. 2. Tweet about it. 3. Get ICCV Rejection. #ICCV2023

WACV 2023 awards. #WACV2023 @wacv_official

The Summer@EPFL 2023 application site is now open! 🎊 To apply, please visit the Summer@EPFL website: summer.epfl.ch. The application deadline for all students is the Sunday closest to the 1st of December (anywhere on earth).
