
Reading @MetaAI's Segment-Anything, and I believe today is one of the "GPT-3 moments" in computer vision. It has learned the *general* concept of what an "object" is, even for unknown objects, unfamiliar scenes (e.g. underwater & cell microscopy), and ambiguous cases.
I still can't believe both the model and data (11M images, 1B masks) are OPEN-sourced. Wow.😮
What's the secret sauce? Just follow the foundation model mindset:
1. A very simple but scalable architecture that takes multimodal prompts: text, key points, bounding boxes.
2. An intuitive human annotation pipeline that goes hand in hand with the model design.
3. A data flywheel that lets the model bootstrap itself on tons of unlabeled images.
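Point 1 is the key design idea: one model, one entry point, any prompt type. Here's a minimal toy sketch of what such a promptable interface looks like — this is hypothetical illustration code, not Meta's actual `segment_anything` API; the prompt classes and `predict_mask` function are made up for this sketch, and the real model returns actual masks plus quality scores.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical prompt types mirroring SAM's multimodal prompt idea.
@dataclass
class PointPrompt:
    x: float
    y: float
    label: int  # 1 = foreground click, 0 = background click

@dataclass
class BoxPrompt:
    x0: float
    y0: float
    x1: float
    y1: float

@dataclass
class TextPrompt:
    text: str

Prompt = Union[PointPrompt, BoxPrompt, TextPrompt]

def predict_mask(image, prompt: Prompt) -> dict:
    """Toy stand-in for a promptable mask decoder: a single entry
    point accepts any prompt type and returns a (fake) mask record."""
    if isinstance(prompt, PointPrompt):
        seed = (prompt.x, prompt.y)
    elif isinstance(prompt, BoxPrompt):
        seed = ((prompt.x0 + prompt.x1) / 2, (prompt.y0 + prompt.y1) / 2)
    elif isinstance(prompt, TextPrompt):
        seed = prompt.text
    else:
        raise TypeError(f"unsupported prompt: {prompt!r}")
    return {"seed": seed, "mask": None}  # the real model returns masks + scores

print(predict_mask(None, PointPrompt(10, 20, 1))["seed"])  # (10, 20)
print(predict_mask(None, BoxPrompt(0, 0, 4, 6))["seed"])   # (2.0, 3.0)
```

The point is the shape of the interface, not the internals: because every prompt funnels into the same decoder, the same trained model serves interactive clicking, box-to-mask, and (in the paper's exploratory form) text-to-mask.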
IMHO, Segment-Anything has done everything right.
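The flywheel in point 3 can also be sketched as a loop. Everything below is a hypothetical toy with made-up numbers, not the SA-1B pipeline (which interleaved assisted-manual, semi-automatic, and fully automatic annotation stages): it just shows the core dynamic — keep high-confidence predictions as new labels, so the labeled pool (and with it, model confidence) grows round over round.

```python
import random
random.seed(0)

unlabeled = [f"img_{i}" for i in range(100)]  # fake unlabeled image pool
labeled = {}  # image -> mask (auto-labeled so far)

def model_predict(image, n_labeled):
    """Toy model: confidence grows as the labeled pool grows,
    mimicking retraining on each flywheel round."""
    confidence = min(0.99, 0.5 + 0.005 * n_labeled + random.random() * 0.2)
    return {"mask": f"mask_for_{image}", "confidence": confidence}

THRESHOLD = 0.6
for round_idx in range(3):
    # One flywheel round: predict on unlabeled data, keep confident masks.
    for image in list(unlabeled):
        pred = model_predict(image, len(labeled))
        if pred["confidence"] >= THRESHOLD:
            labeled[image] = pred["mask"]  # auto-label
            unlabeled.remove(image)
    # (Retraining on `labeled` would happen here in a real pipeline.)
    print(f"round {round_idx}: {len(labeled)} labeled, {len(unlabeled)} left")
```

Each round the model labels what it's sure about, retrains, and becomes sure about more — which is how 11M images end up with 1B masks without 1B human clicks.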