Day Zero for Multi-Vector Retrieval.
Today we’re flipping the retrieval playbook: no dense model adaptation, no retrofit.
🏗️ Multi-vector from scratch, powered by PyLate.
Meet ColBERT-Zero
In collaboration with @EPFL and the Swiss AI Initiative, @LightOnIO pre-trained it end-to-end for late-interaction retrieval.
🥇 SOTA on BEIR, <150M params
⚡ Supervised-first → distill = most of the gains for a fraction of the cost
🧠 Prompt alignment is non-negotiable to preserve peak performance through fine-tuning
Models, checkpoints, training code under Apache 2.0.
🧑‍🍳 Kudos to the whole team @antoine_chaffin @l_arnaboldi @AmelieTabatta @KrzakalaF!
🔗 Dive into the release: lighton.ai/lighton-blogs/…
