
🧵 (6/N) We are witnessing a paradigm shift: generative vision pretraining plays a foundational role for vision, much as LLM pretraining does for language. Image generation can serve as a universal interface for vision tasks, mirroring the role text generation plays in language understanding and reasoning.
