Pierre Colombo

Pierre Colombo retweeted

Manuel Faysse@ManuelFaysse·8h

🚨 Do LLMs need to store everything they read in memory? To reduce KV cache size and improve decoding speeds, we propose Self-Pruned KV attention, a mechanism where the model learns to decide which KVs to write in the persistent KV cache, discarding all the rest! @AIatMeta🧵
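A toy sketch of the general idea in the post above, not the authors' implementation: a per-token gate score decides whether a key/value pair is written to the persistent cache, and attention then runs only over what was kept. The names `prune_kv`, `attend`, `gate_scores`, and the fixed `threshold` are all assumptions made for illustration; in the proposed mechanism the decision is learned by the model, whereas here the scores are hard-coded.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def prune_kv(keys, values, gate_scores, threshold=0.5):
    # Hypothetical gate: keep only the KV pairs whose score exceeds the threshold
    keep = gate_scores > threshold
    return keys[keep], values[keep], keep

def attend(query, keys, values):
    # Standard scaled dot-product attention for a single query vector
    weights = softmax(keys @ query / np.sqrt(query.shape[0]))
    return weights @ values

rng = np.random.default_rng(0)
n_tokens, d = 8, 4
keys = rng.normal(size=(n_tokens, d))
values = rng.normal(size=(n_tokens, d))

# Placeholder gate scores (a real model would predict these per token)
gate_scores = np.array([0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4])

cached_k, cached_v, keep = prune_kv(keys, values, gate_scores)
query = rng.normal(size=d)
out = attend(query, cached_k, cached_v)
```

In this toy setup half of the tokens are discarded, so the cache holds 4 of the 8 KV pairs and each decoding step attends over a shorter sequence.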

Pierre Colombo retweeted

Nicolas Boizard@N1colAIs·4d

@JinaAI_ has hopped on the omnimodal train🚂 They just dropped a collection of two Omni embedding models (0.9B & 2B). Similar to BidirLM, they seem to rely on the Qwen modality head for the larger one, while sticking with EuroBERT for the nano version 🥰 huggingface.co/collections/ji…

Pierre Colombo retweeted

Nicolas Boizard@N1colAIs·24 Apr

BidirLM-Omni is on MTEB and Sentence-Transformer! huggingface.co/spaces/mteb/le…
🥇 #1 Open-Source Model on MTEB (#15 overall)
🖼️ #1 across all sizes on MIEB (Image)
🎧 #1 sub-7B model on MAEB (Audio, #2 overall)
Small size, massive performance, fully open.
Model: huggingface.co/BidirLM

tomaarsen@tomaarsen

BidirLM-Omni-2.5B-Embedding is live: a single bidirectional encoder that embeds text, images, and audio into the same space! Three modalities, all in one 2048-dim space. 🧵

Pierre Colombo retweeted

Nicolas Boizard@N1colAIs·25 Apr

We are currently presenting 'Should We Still Pretrain Encoders with Masked Language Modeling?' Come see us in Hall 3 #1304 @iclr_conf arxiv.org/abs/2507.00994

Pierre Colombo retweeted

Antoine Chaffin@antoine_chaffin·23 Apr

If you're at ICLR, come say hi to @orionweller and @N1colAIs! Also, shout-out to all the French people pushing the encoder architecture forward; it seems that, as with ColBERT, French taste is unmatched! (Non-exhaustive list, pardon me but Twitter search is bad): @gisship @PierreColombo6 @ManuelFaysse @pteiletche @MaceQuent1 @mlpc123

Pierre Colombo@PierreColombo6·19 Apr

Great work from @gisship and @N1colAIs from @centralesupelec

BERT-as-a-Judge A robust alternative to rigid lexical matching for LLM evaluation. Matches the performance of LLM-as-a-Judge at a fraction of the computational cost.

Pierre Colombo retweeted

DailyPapers@HuggingPapers·19 Apr

BERT-as-a-Judge A robust alternative to rigid lexical matching for LLM evaluation. Matches the performance of LLM-as-a-Judge at a fraction of the computational cost.

Pierre Colombo retweeted

Orion Weller@orionweller·16 Apr

Encoders are so much better for classification, why not use them for judging? Awesome study from @N1colAIs - cool to see a 210m BERT model beating much larger Qwen and Gemma models.

What’s inside the release:
🔌 Plug & play BERT-as-a-judge model: huggingface.co/collections/ar…
🛠️ Support to train your own custom evaluators: github.com/artefactory/BE…
📄 Study on the limits of lexical methods: arxiv.org/pdf/2604.09497

Pierre Colombo@PierreColombo6·15 Apr

Evaluation is underrated. If your eval signal is noisy, you're flying blind. BERT-as-a-Judge gives you a fast, cheap way to improve your signal-to-noise ratio without spinning up a full LLM judge. Exactly the kind of infra work that compounds. @gisship @N1colAIs congrats!

🎉 Second paper this month! Introducing BERT-as-a-Judge (x @gisship) ⚖️ Evaluating LLMs with rigid lexical methods often fails right answers due to bad formatting. While "LLM-as-a-Judge" solves this, it remains costly & slow. Our fix? A lightweight, encoder-driven approach.

Pierre Colombo retweeted

Nicolas Boizard@N1colAIs·15 Apr

🎉 Second paper this month! Introducing BERT-as-a-Judge (x @gisship) ⚖️ Evaluating LLMs with rigid lexical methods often fails right answers due to bad formatting. While "LLM-as-a-Judge" solves this, it remains costly & slow. Our fix? A lightweight, encoder-driven approach.

Pierre Colombo retweeted

Niklas Muennighoff@Muennighoff·11 Apr

There's a wave of omni embedding models (gemini, nemotron, bidirlm). Excited to support this trend with our multimodal mteb versions (mieb, maeb) - video coming soon🎥

🚀 New model family release with an OMNIMODAL version ! After Eurobert, I'm excited to introduce BidirLM, a family of 5 frontier bidirectional encoders including an OMNIMODAL encoder at just 2.5B parameters. 🧵👇 huggingface.co/BidirLM

Pierre Colombo@PierreColombo6·11 Apr

Omni embeddings are becoming the new standard. Glad to see @N1colAIs @Muennighoff pushing multimodal eval forward with MIEB & MAEB — can't wait for the video!

There's a wave of omni embedding models (gemini, nemotron, bidirlm). Excited to support this trend with our multimodal mteb versions (mieb, maeb) - video coming soon🎥

Pierre Colombo retweeted

jonah@drexalt·8 Apr

They even released the base bidirectional models 😍 Great release, thanks for all the checkpoints ♥️

🚀 New model family release with an OMNIMODAL version ! After Eurobert, I'm excited to introduce BidirLM, a family of 5 frontier bidirectional encoders including an OMNIMODAL encoder at just 2.5B parameters. 🧵👇 huggingface.co/BidirLM

Pierre Colombo retweeted

Antoine Chaffin@antoine_chaffin·8 Apr

The world needs more encoders. Turning decoders into encoders is a very strong path forward considering the edge of public decoders (see Ettin/previous work from Nicolas). Happy to see more work towards this in the omni setup, and also public models!! Can’t wait to try them out.

🚀 New model family release with an OMNIMODAL version ! After Eurobert, I'm excited to introduce BidirLM, a family of 5 frontier bidirectional encoders including an OMNIMODAL encoder at just 2.5B parameters. 🧵👇 huggingface.co/BidirLM

Pierre Colombo retweeted

Nicolas Boizard@N1colAIs·8 Apr

🚀 New model family release with an OMNIMODAL version ! After Eurobert, I'm excited to introduce BidirLM, a family of 5 frontier bidirectional encoders including an OMNIMODAL encoder at just 2.5B parameters. 🧵👇 huggingface.co/BidirLM

Pierre Colombo retweeted

Nicolas Boizard@N1colAIs·8 Apr

📦 Models & data: huggingface.co/BidirLM
📝 Blog: huggingface.co/blog/Nicolas-B…
📄 Paper: arxiv.org/abs/2604.02045
Joint work with @TheoDescha33800, @gisship, @PierreColombo6, @CelineHudelot 🙌
Kudos to @drexalt and @_reachsumit, who spotted the paper 4 days before release and kindly shared it.

Pierre Colombo retweeted

Manuel Faysse@ManuelFaysse·15 Feb

Most practitioners would agree that text embeddings should be "contextual", i.e. they should encode a passage w.r.t. the wider scope of the entire document the passage stems from; "They beat the British" could refer to football or French history without further context...

In ConTEB (arxiv.org/abs/2505.24782), we highlight the standard failure modes of embedding models on retrieval tasks that require context to be properly embedded. We also propose a training strategy that extends standard "late chunking" to teach models to infuse embeddings with just the right amount of contextual knowledge to optimize retrieval.

Super happy to see some new work by @perplexity_ai on contextual embedding models. They eval on ConTEB and use our in-sequence contrastive loss, along with a ton of cool techniques in multiple phases of training. Love the work @bo_wangbo and will read in detail, but super happy to see one more stone towards contextual embedding models, in the path already traveled by @hxiao and @jxmnop! Link to the paper: arxiv.org/abs/2602.11151…
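For readers unfamiliar with the "late chunking" mentioned above, here is a minimal sketch of the standard idea only, not of the ConTEB training strategy: the whole document is encoded once, and token embeddings are pooled per chunk afterwards, so each chunk vector is built from tokens that already saw the full document. The function name `late_chunk`, the random `token_embeddings` standing in for a real encoder's output, and the chunk boundaries are all assumptions made for illustration.

```python
import numpy as np

def late_chunk(token_embeddings, chunk_spans):
    """Mean-pool document-level token embeddings into one vector per chunk.

    token_embeddings: (n_tokens, dim) array from encoding the FULL document.
    chunk_spans: list of (start, end) token indices, end exclusive.
    """
    pooled = [token_embeddings[start:end].mean(axis=0) for start, end in chunk_spans]
    return np.stack(pooled)

# Stand-in for the output of a long-context encoder run over the whole document
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(12, 4))  # 12 tokens, 4-dim embeddings

# Hypothetical chunk boundaries (e.g. three passages of the document)
chunk_spans = [(0, 5), (5, 9), (9, 12)]

chunk_vectors = late_chunk(token_embeddings, chunk_spans)
```

The contrast is with chunk-then-encode pipelines, where each passage is embedded in isolation and a sentence like "They beat the British" never sees the rest of the document.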

Pierre Colombo retweeted

Manuel Faysse@ManuelFaysse·31 Dec

In August, I joined FAIR at Meta in @hjegou's group for an end of thesis internship. I can't talk much for the moment about what we have been doing (hint: not retrieval), but it's very exciting and I am having lots of fun working with great people! (13/15)

Pierre Colombo retweeted

Manuel Faysse@ManuelFaysse·31 Dec

In other great news, our paper on designing rigorous Membership Inference attacks against LLMs, led by @matthieu_meeus, won the Best Paper Award at IEEE SaTML! (12/15) x.com/matthieu_meeus…

Are membership inference attacks (MIAs) against LLMs rushing nowhere? 🏃‍ ➡️ In a new SoK, we look at how things have evolved recently, show popular evaluation setups to be flawed, and examine solutions going forward.

Pierre Colombo retweeted

Manuel Faysse@ManuelFaysse·31 Dec

In a follow-up project, we carefully investigate the differences between Masked Language Modeling (encoder) and Next Token Prediction (decoder) objectives to produce text representations and uncover many nice insights into training efficiency. (11/15) arxiv.org/abs/2507.00994
