LightOn


LightOn@LightOnIO·21h

@RasmusToivanen @IgorCarron Fair point. The API is live, with early access at lighton.ai/pricing. As for Azure Doc Intelligence, someone already ran the comparison 👀 github.com/joneswack/ocr-…


Rasmus Toivanen@RasmusToivanen·22h

@IgorCarron @LightOnIO As a European I say great job, and it is impressive, but I would not beat my chest over a single benchmark; it is a fairly niche thing. Offer a SaaS API (if you do not already) and show that you are outgrowing something like Azure Doc Intelligence in the EU. That would be great.


Igor Carron@IgorCarron·2d

Everyone told us the AI race was over. That Europe 🇪🇺 missed it. That you need $10B clusters and closed-source moats to compete.

Then @LightOnIO's LightOnOCR-2 (1B parameters, open-source, running on a single GPU you can put on your desk) just beat OpenAI GPT-5 mini, Anthropic Claude Sonnet, Google Gemini 2.5 Flash, Zhipu GLM-4.5V, and DeepSeek-OCR on table extraction. The work that actually matters.

Not Silicon Valley 🇺🇸
Not Shenzhen 🇨🇳
Not Beijing 🇨🇳
Not Hangzhou 🇨🇳
From Paris 🇫🇷 ...with love 💕

The race isn't over. It never was.

Igor Carron@IgorCarron

x.com/i/article/2037…


LightOn retweeted

Igor Carron@IgorCarron·2d

They said you need 235 billion parameters to extract tables from PDFs. They were wrong by 234 billion.

Igor Carron@IgorCarron

x.com/i/article/2037…


LightOn retweeted

Igor Carron@IgorCarron·3d

x.com/i/article/2037…


LightOn@LightOnIO·3d

@iledefrance × @LightOnIO: 30% fewer IT tickets -> €360k saved per year.

15,000 to 20,000 IT tickets per month. Many of them require no technical intervention, just quick access to the right information.

With LightOn, deployed on sovereign infrastructure, the Region has launched an internal AI assistant now used by more than 3,000 staff members.

Results:
⚙️ 30% projected reduction in IT tickets
🔐 Fully sovereign deployment (on-premise)
🔗 Integration with ServiceNow and SSO
📈 About €360k in annual savings

A concrete example of AI that resolves operational friction, beyond technological experimentation. Read the full use case 👇🏻 lighton-dev.webflow.io/fr-blog-posts/…


LightOn@LightOnIO·4d

Single-vector embeddings can't represent what they can't represent. @antoine_chaffin digs into the LIMIT benchmark saga and why late interaction models already solve it, despite originally lower reported results

x.com/i/article/2033…
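As a rough illustration of the contrast drawn in the post above, the sketch below compares a single-vector score with a late-interaction (MaxSim) score on toy data. It is not taken from the linked article or from the LIMIT benchmark: the token vectors are made up, and mean pooling stands in for whatever pooling a real single-vector model learns.

```python
# Toy comparison of single-vector scoring and late-interaction (MaxSim) scoring.
# Assumptions: hand-written 2-D token vectors; mean pooling as the single-vector
# stand-in. Real models learn their embeddings and pooling.

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def mean_pool(vecs):
    """Collapse a list of token vectors into one vector by averaging."""
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def single_vector_score(query_vecs, doc_vecs):
    """Score with one pooled vector per side."""
    return dot(mean_pool(query_vecs), mean_pool(doc_vecs))

def maxsim_score(query_vecs, doc_vecs):
    """Late interaction: each query token keeps its best-matching
    document token, and the per-token maxima are summed."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

query = [[1.0, 0.0], [0.0, 1.0]]   # two distinct query tokens
doc_a = [[1.0, 0.0], [0.0, 1.0]]   # matches each query token exactly
doc_b = [[0.5, 0.5], [0.5, 0.5]]   # blurry tokens with the same average as doc_a

single_a = single_vector_score(query, doc_a)
single_b = single_vector_score(query, doc_b)
late_a = maxsim_score(query, doc_a)
late_b = maxsim_score(query, doc_b)
```

In this toy case the pooled scores for the two documents are identical, while MaxSim ranks `doc_a` above `doc_b` because it keeps token-level matches that pooling averages away.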


LightOn retweeted

Antoine Chaffin@antoine_chaffin·4d

x.com/i/article/2033…


LightOn@LightOnIO·5d

🎙️ "AI must be thought of as infrastructure anchored in the documentary reality of organizations, and no longer as an application ex machina." @IgorCarron was the guest of @simottel on @bfmbusiness to discuss LightOn's latest innovations and the new field they open up: document intelligence. bfmtv.com/economie/repla…


LightOn@LightOnIO·6d

To everyone who has hit the wall doing RAG: we planned this one for you.

Broken retrieval. Hallucinations at inference. Pipelines that fold the moment data gets sensitive. We know where it breaks. We built Paradigm to fix it.

LightOn is joining @TDSYNNEX on March 27 to show what production-grade retrieval actually looks like inside a regulated enterprise:
🔍 Hybrid search
🧠 Structured reasoning
📋 Full auditability
🔒 Zero data leaving your infra

@Gauthier_Z brings the technical depth alongside Fabrice Bagniakana. No skipping the hard parts.

📅 March 27 · 11:00–12:00 CET
🔗 Register: events.teams.microsoft.com/event/21406722…


Raphaël Sourty@raphaelsrty


LightOn@LightOnIO·23 Mar

LightOn bet on multi-vector early. This is payday.

When most systems were still compressing everything into a single embedding, LightOn went the other way. We built the ecosystem, open source from the ground up, and multi-vector is now winning where it counts:
🧩 Complex queries
📚 Long documents
💻 Code
🎯 Out-of-distribution
🤖 Agentic systems

@AmelieTabatta and @antoine_chaffin joined @CShorten30 on the @weaviatepodcast to break down why we made this bet, what we've built, and what it unlocks for the next generation of search and reasoning.

🎧: youtu.be/44GC3E-WbHU


LightOn@LightOnIO·20 Mar

Days since LightOn last shipped a retrieval milestone: 0

BM25x just dropped. Don't choose between lexical, dense, and multi-vector semantic retrieval. Run all three. They're cheaper, faster, and better, all at once. That's hybrid search with no compromises → lighton.ai/lighton-api

Released BM25x on @LightOnIO git this week: 13,000 queries per second (QPS) on MS MARCO (8.8M documents) with 4×H100, against 19 QPS for BM25s (CPU). The comparison is not fair, but let me introduce bm25x 👇
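The posts above talk about running lexical, dense, and multi-vector retrieval together but do not say how the three result lists are merged. One common option is reciprocal rank fusion (RRF), sketched below purely as an assumption; it is not presented as what BM25x or the LightOn API actually does. The document ids and the constant `k=60` are illustrative.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids.

    Each document receives 1 / (k + rank) from every list it appears in
    (rank starts at 1); documents are returned by descending fused score,
    with ties broken by id for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical top results from three retrievers for the same query.
lexical = ["d1", "d2", "d3"]        # e.g. a BM25-style retriever
dense = ["d2", "d3", "d1"]          # e.g. a single-vector embedding model
multi_vector = ["d2", "d1", "d4"]   # e.g. a late-interaction model

fused = reciprocal_rank_fusion([lexical, dense, multi_vector])
print(fused)
```

RRF only needs ranks, not comparable scores, which is why it is often used when retrievers produce scores on different scales.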


LightOn retweeted

Benjamin Warner@benjamin_warner·20 Mar

ModernBERT is the base model which keeps on delivering.

BrowseComp-Plus, perhaps the hardest popular deep research task, is now solved at nearly 90%... and all it took was a 150M model ✨ Thrilled to announce that Reason-ModernColBERT did it again and outperforms all models (including models 54× bigger) on all metrics


LightOn retweeted

Leonie@helloiamleonie·20 Mar

Antoine and the LightOn team did it again: 150M multi-vector model > 8B single-vector model


LightOn retweeted

Raymond Weitekamp@raw_works·19 Mar

currently re-embedding my entire machine, thank you very much! LateOn-Code-edge for code search and Reason-ModernColBERT for prose/docs search.


LightOn retweeted

tomaarsen@tomaarsen·19 Mar

This is very very nice to see, multi-vector models really seem like a class of their own here!


LightOn retweeted

Bo@bo_wangbo·19 Mar

After hacking on pylate and fast-plaid for 2 days, I'm now a huge fan of LightOn's work!


LightOn retweeted

Connor Shorten@CShorten30·19 Mar

Super exciting win for Agentic Search and Late Interaction! 🧬

GPT-5 + Reason-ModernColBERT (150M) reaches ~88% accuracy with an average of ~13 search calls. For reference, when BrowseComp-Plus was published in August 2025, the maximum accuracy reported was ~70% using GPT-5 + Qwen3-Embed-8B, with ~22 search calls.

Searching with reasoning 🤖💭 is a beast. 🔥 This is a huge endorsement for semantic search, and Late Interaction models shine in particular thanks to their effectiveness at long input modeling with fine-grained similarity scores. 🛠️

Congratulations @antoine_chaffin and team! 🎉


LightOn retweeted

Omar Khattab@lateinteraction·20 Mar

biggest AI news in 2 weeks ICYMI