Thank you for the retweet!
Our dataset, PixelProse, contains dense, descriptive captions for over 16M images, generated with the Google Gemini Vision model!
We carefully curate images from 3 different sources, filter for CSAM, and provide additional filters and metadata.
merve (@mervenoyann)
Forget about all the captioning datasets you've tried before! PixelProse is a captioning dataset of 16M image-caption pairs, with less toxicity and higher details ✨ huggingface.co/datasets/tomg-…