
Iacopo Masi
3.3K posts

@_iAc
computer scientist, professor, researcher in computer vision (teaching machines to see), philosopher and ex-basketball player, scuba diver, human being!

- Implicit Inversion turns CLIP into a Decoder w/ @GladiaLab
- MASS: MoErging through Adaptive Subspace Selection w/ @GladiaLab @RSTLessGroup
Thanks to all our collaborators. See you in 🇧🇷

Andrej Karpathy on autoresearch with an untrusted pool of workers: "My designs that incorporate an untrusted pool of workers (into autoresearch) actually look a little bit like a blockchain. Instead of blocks, you have commits, and these commits can build on each other and contain changes to the code as you're improving it. The proof of work is basically doing tons of experimentation to find the commits that work." The idea that distributed & permissionless autoresearch ~= proof-of-useful-work remains a high-level intuition for now, but it is extremely intriguing to say the least. Someone needs to take this further. See QT for more on what's missing.

[1/D] 🤔 What are drifting models really connected to?

📢 Our new paper, A Unified View of Drifting and Score-Based Models, shows that the bridge to score-based models is clear and precise (w/ team and @mittu1204, @StefanoErmon, @MoleiTaoMath)!

✍️ Main takeaway: drifting is more closely connected to score-based (diffusion) modeling than it may first appear!

🔗 arxiv.org/abs/2603.07514

🎯 Here’s why: Drifting’s mean-shift moves a sample toward the kernel-weighted average of nearby samples. The score function points toward regions of higher density. So both describe local directions that push samples toward where data is denser.

We show that this link is exact for Gaussian kernels (Section 4.1):
📌 Drifting’s mean-shift = a rescaled score-matching field between the Gaussian-smoothed data and model distributions — the vector field underlying score matching (Tweedie!).
📌 This also clarifies the bridge to Distribution Matching Distillation (DMD): both use score-based transport directions and differ only in how the score is realized — drifting does so nonparametrically through kernel neighborhoods, whereas DMD relies on a pretrained diffusion teacher.

🤔 So what happens for the default Laplace kernel used in drifting models? Let’s look below 👇
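The Gaussian-kernel identity the thread points to can be checked numerically: for a Gaussian KDE (the Gaussian-smoothed empirical density), the mean-shift vector equals the bandwidth-squared times the KDE's score. This is a minimal sketch on toy data, not the paper's implementation; the sample sizes, bandwidth, and query point are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 2))   # toy "data" samples
x = np.array([0.5, -0.3])          # query point
h = 0.8                            # Gaussian kernel bandwidth

# Gaussian kernel weights w_i = exp(-||x - x_i||^2 / (2 h^2))
diffs = data - x
w = np.exp(-np.sum(diffs**2, axis=1) / (2 * h**2))

# Mean-shift vector: kernel-weighted average of neighbors, relative to x
mean_shift = (w @ diffs) / w.sum()

# Score of the Gaussian KDE p(x) = (1/n) sum_i N(x; x_i, h^2 I):
# grad_x log p(x) = sum_i w_i (x_i - x) / (h^2 sum_i w_i)
score = (w @ diffs) / (h**2 * w.sum())

# Exact identity for the Gaussian kernel: mean_shift = h^2 * score,
# i.e. mean-shift is a rescaled score field of the smoothed density
assert np.allclose(mean_shift, h**2 * score)
```

Both directions point toward the kernel-weighted mass of the data, which is the sense in which mean-shift and the score describe the same local transport up to a scale factor.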

introducing Flywheel: the infrastructure for autonomous research.

🚨 New workshop at #CVPR2026: MUV — Machine Unlearning for Vision As vision models scale and move into real-world use, the ability to remove concepts or behaviors after training is becoming increasingly important. Join us: 🔗 …chine-unlearning-for-vision.github.io @CVPR


Can AI forget? 🧠❌ Join MUV at @CVPR 26 in Denver! 🏔️ Speakers from @GoogleDeepMind, @MIT_CSAIL & more. 📝 Submit by March 15! Organizers: @SapienzaRoma, @MIT, @TU_Muenchen, @_italai and MPI. Details: …chine-unlearning-for-vision.github.io #CVPR2026 #AI #ComputerVision
