Martin Gubri
@framart1
1.4K posts
Research Lead @parameterlab working on Trustworthy AI | he/him Other accounts: 🦋 mgubri | 🐘 @[email protected]

Tübingen, Germany · Joined December 2012
805 Following · 377 Followers

Pinned Tweet
Martin Gubri@framart1·
🦹💥 How to detect if my LLM was stolen or leaked? 🤖💥 I am delighted to announce TRAP 🪤, our new #ACL2024 Findings paper ☝️ We showcase how to use adversarial prompts as model fingerprints for LLMs. A thread 🧵 ⬇️⬇️⬇️
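The verification side of this fingerprinting idea can be sketched in a few lines: an adversarial suffix is optimized offline so that only the fingerprinted model emits a chosen target answer, and checking a suspect endpoint is then a single comparison. Everything below (the prompt, target, and the two stand-in "models") is illustrative, not the paper's actual prompts.

```python
# Minimal sketch of TRAP-style fingerprint verification (illustrative only).
# The adversarial suffix, target string, and models are made-up stand-ins.

def verify_fingerprint(model, prompt: str, target: str) -> bool:
    """True if the suspect model emits the secret target for the trap prompt."""
    return model(prompt).strip() == target

TRAP_PROMPT = "Write a random number. <optimized adversarial suffix>"
TARGET = "314"

# A leaked copy of the fingerprinted model reproduces the forced answer;
# an unrelated model gives a generic reply instead.
stolen_model = lambda p: "314" if "adversarial suffix" in p else "hello"
other_model = lambda p: "Sure! How about 42?"

print(verify_fingerprint(stolen_model, TRAP_PROMPT, TARGET))  # True
print(verify_fingerprint(other_model, TRAP_PROMPT, TARGET))   # False
```

The point of using an optimized suffix rather than a plain question is that an unrelated model is very unlikely to produce the exact forced target by chance.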
Martin Gubri@framart1·
🌍 We've made LLM watermarking equally robust across all languages we studied, scaling to 100+ languages! Even SOTA watermarks can be removed by translating to another language, e.g. Tamil. This hits hardest in low-resource languages, where moderation tools are already weak. 🧵
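To see why translation defeats token-level watermarks, here is a hedged sketch of the common green-list scheme (my illustration of the generic technique, not this paper's method): detection z-scores the fraction of keyed "green" tokens, and translating the text re-tokenizes it in another language, erasing the sampling bias.

```python
# Illustrative green-list watermark detection (generic scheme, not the
# paper's). Generation biases sampling toward a keyed pseudorandom "green"
# half of the vocabulary; detection z-scores the green-token fraction.
import hashlib
import math

def is_green(token: str, key: str = "secret") -> bool:
    """Keyed pseudorandom split of the vocabulary into ~50% green tokens."""
    return hashlib.sha256((key + token).encode()).digest()[0] % 2 == 0

def z_score(tokens) -> float:
    """Standard score of the green count vs. the unwatermarked 50% baseline."""
    n = len(tokens)
    greens = sum(is_green(t) for t in tokens)
    return (greens - 0.5 * n) / math.sqrt(0.25 * n)

vocab = [f"tok{i}" for i in range(1000)]
green = [t for t in vocab if is_green(t)]
red = [t for t in vocab if not is_green(t)]

watermarked = green[:90] + red[:10]  # 90% green tokens: strong signal
translated = vocab[:100]             # re-tokenized text: ~50% green, no signal

print(z_score(watermarked))          # 8.0
print(abs(z_score(translated)) < 5)  # True: looks unwatermarked
```

A round-trip through Tamil produces fluent text whose tokens were never sampled under the green bias, which is exactly why detection collapses.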
Martin Gubri@framart1·
The D&B track now has a larger scope and a new name: Evaluation & Datasets. It focuses on evaluation itself as a scientific object. It is really nice to have a venue for critical analysis of evaluation and for negative results; that was sorely missing in ML!
NeurIPS Conference@NeurIPSConf

The Datasets & Benchmarks track is now "Evaluation and Datasets", with an expanded scope for NeurIPS 2026! Read the call for papers neurips.cc/Conferences/20…, and learn more about the changes in our blog post: blog.neurips.cc/2026/03/23/int…

Martin Gubri@framart1·
LLM agents include far more than a model: framework, orchestration, tools, error handling, etc. These harness engineering choices matter, but they're rarely compared. MASEval makes that straightforward. I'm very proud to have supervised its development. Give it a look! ⬇
Cornelius Emde@CorEmde

1/ Evaluating a single agent harness is hard. Evaluating a multi-agent system? That's a whole different problem. Most eval tools treat the model as the unit of analysis. But in multi-agent systems, the system is what matters. That's why we built MASEval 🧵 #Agents #AI #Eval

Martin Gubri retweeted
Cornelius Emde@CorEmde·
Great work led by @anmgoel on how fragile contextual integrity can be in LLMs. This work shows that contextual privacy degrades easily during fine-tuning on benign data, and that common safety benchmarks don't pick this up. #AISecurity #AIAgents
Anmol Goel@anmgoel

🚨 Fine-tuning your model to be more helpful or empathetic might be making it less private, without you noticing. In our latest work, we show that benign fine-tuning can silently break contextual privacy in language models while safety & general capabilities appear intact. ⬇️

Guri Singh@heygurisingh·
@framart1 Fine-tuning your model to be more helpful
Martin Gubri@framart1·
New paper out!🎉 One of our most surprising findings: fine-tuning an LLM on debugging code has unexpected side effects on contextual privacy. The model learns from printing variables that internal state is OK to share, then generalises this to social situations🤯 A 🧵 below👇
Anmol Goel@anmgoel

🚨 Fine-tuning your model to be more helpful or empathetic might be making it less private, without you noticing. In our latest work, we show that benign fine-tuning can silently break contextual privacy in language models while safety & general capabilities appear intact. ⬇️

Martin Gubri retweeted
Alexander Rubinstein@a_rubique·
Happy to share that our new paper was accepted to ICLR 2026! This paper helps people who spend too much time waiting for LLM evaluations on benchmarks like MMLU-Pro. We show how to reduce this time by up to 100×. Big thanks to the co-authors Benjamin Raible, @framart1, and @coallaoh!
Alexander Rubinstein@a_rubique

🪩 Evaluate your LLMs on benchmarks like MMLU at 1% cost. In our new paper, we show that outputs on a small subset of test samples that maximise diversity in model responses are predictive of the full dataset performance. Project page: arubique.github.io/disco-site/ More below 🧵👇

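The selection idea in the quoted thread can be sketched roughly as follows; farthest-point sampling over response embeddings is my stand-in for the diversity-maximizing selection step, and the data here is entirely synthetic.

```python
# Rough sketch of diversity-based benchmark subsampling (my stand-in for
# DISCO's selection step, on synthetic data): pick k samples whose model
# responses are maximally spread out, then use subset accuracy as a cheap
# proxy for full-benchmark accuracy.
import numpy as np

def farthest_point_subset(features: np.ndarray, k: int) -> list:
    """Greedy max-min selection: each pick maximizes its distance to the set."""
    chosen = [0]
    dists = np.linalg.norm(features - features[0], axis=1)
    for _ in range(k - 1):
        nxt = int(dists.argmax())
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(features - features[nxt], axis=1))
    return chosen

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 16))  # toy per-sample response embeddings
correct = rng.random(500) < 0.7        # toy per-sample correctness (~70% acc)

idx = farthest_point_subset(features, 50)
print(f"full acc {correct.mean():.3f}, 10%-subset acc {correct[idx].mean():.3f}")
```

On real benchmarks the subset is fixed once per benchmark, so every new model only pays for the 50 queries instead of 500.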
Martin Gubri@framart1·
🎉Thrilled to share that both of my #ICLR2026 submissions were accepted (2/2)! 🪩DISCO, Efficient Benchmarking: x.com/a_rubique/stat… 🩺Dr.LLM, Dynamic Layer Routing: x.com/omarsar0/statu… Huge thanks to my co-authors, especially first authors @a_rubique & Ahmed Heakl!
elvis@omarsar0

Dr.LLM: Dynamic Layer Routing in LLMs Neat technique to reduce computation in LLMs while improving accuracy. Routers increase accuracy while reducing layers by roughly 3 to 11 per query. My notes below:

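The layer-routing idea behind Dr.LLM can be caricatured in a few lines; the "router" below is a made-up threshold rule standing in for the learned per-layer routers, and the blocks are toy functions rather than transformer layers.

```python
# Toy caricature of dynamic layer routing: a router inspects the running
# state before each block and decides execute-or-skip, so easy queries use
# fewer layers. The threshold rule is a made-up stand-in for a learned router.

def run_with_routing(x: float, layers, router):
    """Apply only the layers the router selects; report how many ran."""
    executed = 0
    for i, layer in enumerate(layers):
        if router(x, i):
            x = layer(x)
            executed += 1
    return x, executed

layers = [lambda v: 0.5 * v + 1.0] * 12   # identical toy blocks, fixed point 2.0
router = lambda x, i: abs(x - 2.0) > 0.5  # skip once "converged" (made up)

out, used = run_with_routing(0.0, layers, router)
print(f"{used} of {len(layers)} layers executed")  # 2 of 12 layers executed
```

The per-query saving the quoted tweet mentions (roughly 3 to 11 layers) comes from exactly this kind of input-dependent skipping.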
Martin Gubri@framart1·
🧵 Many hidden gems about LLM benchmark contamination in the GAPERON paper! This French-English model paper has some honest findings about how contamination affects benchmarks (and why no one wants to truly decontaminate their training data) Thread 👇
Alexander Doria@Dorialexander·
wtf i finally get quality content on the corposlop network.
Martin Gubri@framart1·
Full paper: arxiv.org/abs/2510.25771 Key sections: 5.3: Deliberate contamination experiments 7.2.1: Evidence of contamination in existing models 7.2.2: How quality filters amplify leakage 7.2.3 + Appendix C: Game-theoretic modelling
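For readers unfamiliar with decontamination, the baseline check this thread alludes to is plain n-gram overlap; the sketch below is my illustration of that generic technique, not GAPERON's exact pipeline, and the example strings are invented.

```python
# Generic n-gram decontamination check (illustrative; not GAPERON's exact
# pipeline): flag a training document if it shares any 8-gram with a
# benchmark item. Quality filters can amplify leakage precisely because
# benchmark-like text scores as "high quality" and survives filtering.

def ngrams(text: str, n: int = 8) -> set:
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def is_contaminated(doc: str, benchmark_items: list, n: int = 8) -> bool:
    bench = set()
    for item in benchmark_items:
        bench |= ngrams(item, n)
    return bool(ngrams(doc, n) & bench)

bench = ["what is the capital of france and why does it matter so much"]
leaky = "study notes: what is the capital of france and why does it matter so much"
clean = "the weather in paris is lovely in spring with mild temperatures overall"

print(is_contaminated(leaky, bench), is_contaminated(clean, bench))  # True False
```

Exact 8-gram matching is cheap but brittle: paraphrased or translated benchmark items slip straight through, which is part of why truly decontaminating training data is so hard.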