Martin Gubri
@framart1
1.4K posts
Research Lead @parameterlab working on Trustworthy AI | he/him Other accounts: 🦋 mgubri | 🐘 @[email protected]

Tübingen, Germany · Joined December 2012
805 Following · 377 Followers

Pinned Tweet
Martin Gubri@framart1·
🦹💥 How to detect if my LLM was stolen or leaked? 🤖💥 I am delighted to announce TRAP 🪤, our new #ACL2024 Findings paper ☝️ We showcase how to use adversarial prompts as model fingerprints for LLMs. A thread 🧵 ⬇️⬇️⬇️
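The verification side of this fingerprinting idea can be sketched in a few lines: an adversarial suffix is optimized offline so that only the fingerprinted model emits a chosen target answer, and checking a suspect endpoint is then a single comparison. Everything below (the prompt, target, and the two stand-in "models") is illustrative, not the paper's actual prompts.

```python
# Minimal sketch of TRAP-style fingerprint verification (illustrative only).
# The adversarial suffix, target string, and models are made-up stand-ins.

def verify_fingerprint(model, prompt: str, target: str) -> bool:
    """True if the suspect model emits the secret target for the trap prompt."""
    return model(prompt).strip() == target

TRAP_PROMPT = "Write a random number. <optimized adversarial suffix>"
TARGET = "314"

# A leaked copy of the fingerprinted model reproduces the forced answer;
# an unrelated model gives a generic reply instead.
stolen_model = lambda p: "314" if "adversarial suffix" in p else "hello"
other_model = lambda p: "Sure! How about 42?"

print(verify_fingerprint(stolen_model, TRAP_PROMPT, TARGET))  # True
print(verify_fingerprint(other_model, TRAP_PROMPT, TARGET))   # False
```

The point of using an optimized suffix rather than a plain question is that an unrelated model is very unlikely to produce the exact forced target by chance.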
Martin Gubri@framart1·
🌍 We've made LLM watermarking equally robust across all languages we studied, scaling to 100+ languages! Even SOTA watermarks can be removed by translating to another language, e.g. Tamil. This hits hardest in low-resource languages, where moderation tools are already weak. 🧵
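To see why translation defeats token-level watermarks, here is a hedged sketch of the common green-list scheme (my illustration of the generic technique, not this paper's method): detection z-scores the fraction of keyed "green" tokens, and translating the text re-tokenizes it in another language, erasing the sampling bias.

```python
# Illustrative green-list watermark detection (generic scheme, not the
# paper's). Generation biases sampling toward a keyed pseudorandom "green"
# half of the vocabulary; detection z-scores the green-token fraction.
import hashlib
import math

def is_green(token: str, key: str = "secret") -> bool:
    """Keyed pseudorandom split of the vocabulary into ~50% green tokens."""
    return hashlib.sha256((key + token).encode()).digest()[0] % 2 == 0

def z_score(tokens) -> float:
    """Standard score of the green count vs. the unwatermarked 50% baseline."""
    n = len(tokens)
    greens = sum(is_green(t) for t in tokens)
    return (greens - 0.5 * n) / math.sqrt(0.25 * n)

vocab = [f"tok{i}" for i in range(1000)]
green = [t for t in vocab if is_green(t)]
red = [t for t in vocab if not is_green(t)]

watermarked = green[:90] + red[:10]  # 90% green tokens: strong signal
translated = vocab[:100]             # re-tokenized text: ~50% green, no signal

print(z_score(watermarked))          # 8.0
print(abs(z_score(translated)) < 5)  # True: looks unwatermarked
```

A round-trip through Tamil produces fluent text whose tokens were never sampled under the green bias, which is exactly why detection collapses.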
Martin Gubri@framart1·
The D&B track now has a larger scope and a new name: Evaluation & Datasets. It focuses on evaluation itself as a scientific object. It is really nice to have a venue for critical analysis of evaluation and for negative results; that was sorely missing in ML!
NeurIPS Conference@NeurIPSConf

The Datasets & Benchmarks track is now "Evaluation and Datasets", with an expanded scope for NeurIPS 2026! Read the call for papers neurips.cc/Conferences/20…, and learn more about the changes in our blog post: blog.neurips.cc/2026/03/23/int…

Martin Gubri@framart1·
LLM agents include far more than a model: framework, orchestration, tools, error handling, etc. These harness engineering choices matter, but they're rarely compared. MASEval makes that straightforward. I'm very proud to have supervised its development. Give it a look! ⬇
Cornelius Emde@CorEmde

1/ Evaluating a single agent harness is hard. Evaluating a multi-agent system? That's a whole different problem. Most eval tools treat the model as the unit of analysis. But in multi-agent systems, the system is what matters. That's why we built MASEval 🧵 #Agents #AI #Eval

Martin Gubri retweeted
Cornelius Emde@CorEmde·
Great work led by @anmgoel on how fragile contextual integrity can be in LLMs. This work shows that contextual privacy degrades easily during fine-tuning on benign data, and that common safety benchmarks don't pick this up. #AISecurity #AIAgents
Anmol Goel@anmgoel

🚨 Fine-tuning your model to be more helpful or empathetic might be making it less private, without you noticing. In our latest work, we show that benign fine-tuning can silently break contextual privacy in language models while safety & general capabilities appear intact. ⬇️

Guri Singh@heygurisingh·
@framart1 Fine-tuning your model to be more helpful
Martin Gubri@framart1·
New paper out!🎉 One of our most surprising findings: fine-tuning an LLM on debugging code has unexpected side effects on contextual privacy. The model learns from printing variables that internal state is OK to share, then generalises this to social situations🤯 A 🧵 below👇
Anmol Goel@anmgoel

🚨 Fine-tuning your model to be more helpful or empathetic might be making it less private, without you noticing. In our latest work, we show that benign fine-tuning can silently break contextual privacy in language models while safety & general capabilities appear intact. ⬇️

Martin Gubri retweeted
Alexander Rubinstein@a_rubique·
Happy to share that our new paper was accepted to ICLR 2026! This paper helps people who spend too much time waiting for LLM evaluations on benchmarks like MMLU-Pro. We show how to reduce this time by up to 100×. Big thanks to the co-authors Benjamin Raible, @framart1, and @coallaoh!
Alexander Rubinstein@a_rubique

🪩 Evaluate your LLMs on benchmarks like MMLU at 1% cost. In our new paper, we show that outputs on a small subset of test samples that maximise diversity in model responses are predictive of the full dataset performance. Project page: arubique.github.io/disco-site/ More below 🧵👇

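The selection idea in the quoted thread can be sketched roughly as follows; farthest-point sampling over response embeddings is my stand-in for the diversity-maximizing selection step, and the data here is entirely synthetic.

```python
# Rough sketch of diversity-based benchmark subsampling (my stand-in for
# DISCO's selection step, on synthetic data): pick k samples whose model
# responses are maximally spread out, then use subset accuracy as a cheap
# proxy for full-benchmark accuracy.
import numpy as np

def farthest_point_subset(features: np.ndarray, k: int) -> list:
    """Greedy max-min selection: each pick maximizes its distance to the set."""
    chosen = [0]
    dists = np.linalg.norm(features - features[0], axis=1)
    for _ in range(k - 1):
        nxt = int(dists.argmax())
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(features - features[nxt], axis=1))
    return chosen

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 16))  # toy per-sample response embeddings
correct = rng.random(500) < 0.7        # toy per-sample correctness (~70% acc)

idx = farthest_point_subset(features, 50)
print(f"full acc {correct.mean():.3f}, 10%-subset acc {correct[idx].mean():.3f}")
```

On real benchmarks the subset is fixed once per benchmark, so every new model only pays for the 50 queries instead of 500.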
Martin Gubri@framart1·
🎉Thrilled to share that both of my #ICLR2026 submissions were accepted (2/2)! 🪩DISCO, Efficient Benchmarking: x.com/a_rubique/stat… 🩺Dr.LLM, Dynamic Layer Routing: x.com/omarsar0/statu… Huge thanks to my co-authors, especially first authors @a_rubique & Ahmed Heakl!
elvis@omarsar0

Dr.LLM: Dynamic Layer Routing in LLMs Neat technique to reduce computation in LLMs while improving accuracy. Routers increase accuracy while reducing layers by roughly 3 to 11 per query. My notes below:

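The layer-routing idea behind Dr.LLM can be caricatured in a few lines; the "router" below is a made-up threshold rule standing in for the learned per-layer routers, and the blocks are toy functions rather than transformer layers.

```python
# Toy caricature of dynamic layer routing: a router inspects the running
# state before each block and decides execute-or-skip, so easy queries use
# fewer layers. The threshold rule is a made-up stand-in for a learned router.

def run_with_routing(x: float, layers, router):
    """Apply only the layers the router selects; report how many ran."""
    executed = 0
    for i, layer in enumerate(layers):
        if router(x, i):
            x = layer(x)
            executed += 1
    return x, executed

layers = [lambda v: 0.5 * v + 1.0] * 12   # identical toy blocks, fixed point 2.0
router = lambda x, i: abs(x - 2.0) > 0.5  # skip once "converged" (made up)

out, used = run_with_routing(0.0, layers, router)
print(f"{used} of {len(layers)} layers executed")  # 2 of 12 layers executed
```

The per-query saving the quoted tweet mentions (roughly 3 to 11 layers) comes from exactly this kind of input-dependent skipping.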
Martin Gubri@framart1·
🧵 Many hidden gems about LLM benchmark contamination in the GAPERON paper! This French-English model paper has some honest findings about how contamination affects benchmarks (and why no one wants to truly decontaminate their training data) Thread 👇
Alexander Doria@Dorialexander·
wtf i finally get quality content on the corposlop network.
Martin Gubri@framart1·
Full paper: arxiv.org/abs/2510.25771 Key sections: 5.3: Deliberate contamination experiments 7.2.1: Evidence of contamination in existing models 7.2.2: How quality filters amplify leakage 7.2.3 + Appendix C: Game-theoretic modelling
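For readers unfamiliar with decontamination, the baseline check this thread alludes to is plain n-gram overlap; the sketch below is my illustration of that generic technique, not GAPERON's exact pipeline, and the example strings are invented.

```python
# Generic n-gram decontamination check (illustrative; not GAPERON's exact
# pipeline): flag a training document if it shares any 8-gram with a
# benchmark item. Quality filters can amplify leakage precisely because
# benchmark-like text scores as "high quality" and survives filtering.

def ngrams(text: str, n: int = 8) -> set:
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def is_contaminated(doc: str, benchmark_items: list, n: int = 8) -> bool:
    bench = set()
    for item in benchmark_items:
        bench |= ngrams(item, n)
    return bool(ngrams(doc, n) & bench)

bench = ["what is the capital of france and why does it matter so much"]
leaky = "study notes: what is the capital of france and why does it matter so much"
clean = "the weather in paris is lovely in spring with mild temperatures overall"

print(is_contaminated(leaky, bench), is_contaminated(clean, bench))  # True False
```

Exact 8-gram matching is cheap but brittle: paraphrased or translated benchmark items slip straight through, which is part of why truly decontaminating training data is so hard.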