LightOn

2.4K posts

@LightOnIO

LightOn is a leading European generative AI company delivering secure on-prem RAG for document intelligence, enabling safe use of sensitive data behind the firewall.

Paris, France · Joined October 2015
838 Following · 4.9K Followers
Pinned Tweet
LightOn
LightOn@LightOnIO·
One API to run them all.
Agents do not ask nicely. They dump raw PDFs, garbled tables, and off-domain questions into the same thread.
📄 Parsing: LightOnOCR-2 turns scans, tables, handwriting, and multi-column layouts into structured output. 20+ languages natively. Send the raw file. No Unstructured, no LayoutLM, no glue code.
🧠 Extraction: Pull any key-value, entity, or field you care about straight from documents and images. Define the schema, get structured JSON back.
🔍 Search: Dense embeddings, sparse lexical terms, and NextPlaid late-interaction vectors, powered by LateOn, which generalizes across domains with no fine-tuning. One query hits all three signals. The index picks the signal; you do not pick the index.
One API key. One dashboard. One spend cap. Three endpoints. Zero pipeline maintenance.
Claim your access: console.lighton.ai
LightOn reposted
LightOn
LightOn@LightOnIO·
One API to feed any model, power any agent. One Playground, three endpoints, zero config. Request Early Access!
LightOn
LightOn@LightOnIO·
A supplier files for bankruptcy at 9:47 a.m. You receive an alert: "more than €1M in exposure, call with management at noon." This use case explores an AI's ability to reason over fragmented enterprise documents. When "being roughly right" means walking into the meeting with the wrong figure. 📰 Read the full analysis here: lighton.ai/fr-blog-posts/… 💻 Test the scenario on your own document corpus → console.lighton.ai
LightOn
LightOn@LightOnIO·
Reason-ModernColBERT topped BrowseComp-Plus with just 149M parameters. Now Agent-ModernColBERT adds ~10% on top. Paired with GPT-OSS-120B, it reaches the level of GPT-5 + Qwen3-8B. Still 149M parameters. Fully open. Smaller. Cheaper. Kudos to @antoine_chaffin for the work 👏🏻 Full benchmarks, methodology, model, data, and training code in the blog ↓ lighton.ai/lighton-blogs/…
LightOn reposted
LightOn
LightOn@LightOnIO·
One API to feed any model, power any agent. One Playground, three endpoints, zero config.
LightOn
LightOn@LightOnIO·
@LightOnIO × @Dassault3DS Some conversations signal where the market is heading. On June 9, LightOn will be part of @outscale EXPERIENCES 2026 to discuss a shared vision of enterprise AI: sovereign, secure, and built for operational deployment at scale. As AI moves into critical environments, infrastructure, governance, and execution can no longer be separated. More soon. 📍 CNIT Forest, Paris La Défense #SovereignAI #EnterpriseAI #OUTSCALEExperiences #LightOn #DassaultSystemes
LightOn
LightOn@LightOnIO·
RT @IgorCarron: Tiny @LightOnIO 's LateOn on the same level as GPT-5 Query Rewriter! Onward to beating the Oracle with small and Open Sour…
LightOn
LightOn@LightOnIO·
Better signal. Faster inference. XTR training and WARP index are now merged into PyLate: Same late-interaction signal. Lower inference cost at scale. The Late Interaction train is not slowing down. Jump in!
Antoine Chaffin@antoine_chaffin

XTR makes multi-vector retrieval faster, but there are not many models or much tooling around it, which hinders its adoption. @Robro612 did a very interesting replication study, and we took the opportunity to merge XTR into PyLate, alongside the awesome XTR-WARP of @hugemensa
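For readers unfamiliar with the "late-interaction signal" these posts refer to, here is a minimal sketch of ColBERT-style MaxSim scoring in plain Python. It is an illustration under stated assumptions only: the function names and toy two-dimensional token vectors are hypothetical, and this is not PyLate's API nor the XTR or WARP procedure, which aim to make this kind of scoring cheaper at scale.

```python
from math import sqrt


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for each query token vector, take its best
    cosine similarity over all document token vectors, then sum."""
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    return sum(max(dot(qv, dv) for dv in d) for qv in q)


def rank(query_vecs, docs):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [(name, maxsim_score(query_vecs, vecs)) for name, vecs in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: one vector per token (real models use ~128 dims).
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # matches both query tokens
    "doc_b": [[1.0, 0.0], [-1.0, 0.0]],             # matches only the first
}
ranking = rank(query, docs)
print(ranking)
```

Because each query token is matched independently against every document token, the cost grows with the number of stored token vectors, which is the overhead that faster multi-vector approaches try to reduce.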

LightOn
LightOn@LightOnIO·
From deploying agents → to owning your data, models, and infrastructure. That shift is being explored today at @gosimfoundation Paris 2026, a key European event for open-source and sovereign AI.
🎤 Today, @staghado took the stage for @LightOnIO: “LightOnOCR: Pushing the Performance–Efficiency Pareto Frontier of Open OCR Models”
🔍 Parsing is the first step of any AI system. Before anything else, your data needs to be extracted, structured, and usable.
👉🏻 Discover LightOn's orchestrated pipeline, from parsing to grounded answers: console.lighton.ai
LightOn reposted
LightOn
LightOn@LightOnIO·
The plumbing era of RAG is over. Parsing. Chunking. Embeddings. Reranking. Search. Five moving parts, one fragile seam between each, and a roadmap quietly disappearing into the gaps. The teams shipping fastest have moved up the stack: 🎯 Domain expertise : the part a competitor cannot copy ⚡️ Product velocity : features instead of chunk-size debugging Retrieval infrastructure is solved. The moat is what you build on top of it. Full post → lighton.ai/lighton-blogs/… More from us on this, very soon.
LightOn reposted
LightOn
LightOn@LightOnIO·
The data you need to benchmark enterprise RAG is the data no one shares. So we built it. Today we're releasing EDiTh on @huggingface, an open benchmark for enterprise retrieval, designed around the questions executives actually ask.
What's inside:
📄 1,004 PDFs across 6 languages and 3 formats
🎯 36 use cases with full answer keys
🎭 5.5% plausible distractors (the kind that fool real systems)
🏢 All built around Véracier Industries, a fictional €1.8B industrial group
Each use case mirrors a real executive question, the kind that usually takes a team, a week, and a stack of PDFs to answer. If you're building, buying, or evaluating RAG for the enterprise, EDiTh gives you something the public benchmarks don't: realistic documents, realistic noise, realistic stakes.
Built by Adèle Guignochau and @IgorCarron at @LightOnIO.
Read the release 👉🏻 lighton.ai/lighton-blogs/…
Dataset on Hugging Face 👉🏻 huggingface.co/datasets/light…
#EnterpriseAI #RAG #SovereignAI
LightOn
LightOn@LightOnIO·
Retrieval is solved. One API to feed any model, power any agent. Three endpoints, zero config: /parse /extract /search LightOn Console dropping soon. Sandbox access, ship as you sign up! 🔗 console.lighton.ai