Workshop on Large Language Model Memorization

23 posts


@l2m2_workshop

The First Workshop on Large Language Model Memorization.

World · Joined September 2024
12 Following · 141 Followers
Workshop on Large Language Model Memorization retweeted
Niloofar @niloofar_mire
I'm psyched for my 2 *different* talks on Friday @aclmeeting: 1.@llm_sec (11:00): What does it mean for an AI agent to preserve privacy? 2.@l2m2_workshop (16:00): Emergent Misalignment thru the Lens of Non-verbatim Memorization (& phonetic to visual attacks!) Join us!
[four images attached]
1 reply · 9 reposts · 86 likes · 5.7K views
Workshop on Large Language Model Memorization retweeted
Yanai Elazar @yanaiela
I'll be at #ACL2025 next week! Catch me at the poster sessions, eating sachertorte and schnitzel, and speaking about distributional memorization at the @l2m2_workshop
[image attached]
1 reply · 8 reposts · 90 likes · 5K views
Workshop on Large Language Model Memorization retweeted
Ai2 @allen_ai
For years it’s been an open question — how much is a language model learning and synthesizing information, and how much is it just memorizing and reciting? Introducing OLMoTrace, a new feature in the Ai2 Playground that begins to shed some light. 🔦
17 replies · 130 reposts · 621 likes · 171.9K views
Workshop on Large Language Model Memorization retweeted
Tom McCoy @RTomMcCoy
Do language models just copy text they've seen before, or do they have generalizable abilities? ⬇️This new tool from Ai2 will be very useful for such questions! And allow me to plug our paper on this topic: We find that LLMs are mostly not copying! direct.mit.edu/tacl/article/d… 1/2
[image attached]
[quote-tweet of the Ai2 @allen_ai OLMoTrace announcement above]
1 reply · 6 reposts · 74 likes · 7.8K views
Workshop on Large Language Model Memorization retweeted
Jiacheng Liu @liujc1998
As infini-gram surpasses 500 million API calls, today we're announcing two exciting updates: 1. Infini-gram is now open-source under Apache 2.0! 2. We indexed the training data of OLMo 2 models. Now you can search in the training data of these strong, fully-open LLMs. 🧵 (1/4)
2 replies · 12 reposts · 65 likes · 6.5K views
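For readers who want to try the API the thread describes, here is a minimal sketch of a count query against the public infini-gram endpoint, assuming the JSON interface documented at infini-gram.io; the index name and response fields shown are illustrative, so check the docs for the identifiers of the OLMo 2 training-data indexes.

```python
# Minimal sketch: count occurrences of an n-gram in an infini-gram index.
# The endpoint and payload shape follow the public API docs; the index
# name is illustrative -- see infini-gram.io for current identifiers,
# including the OLMo 2 indexes mentioned in the thread.
import requests

payload = {
    "index": "v4_dolma-v1_7_llama",  # illustrative index name
    "query_type": "count",           # count matches of the query n-gram
    "query": "large language model memorization",
}
resp = requests.post("https://api.infini-gram.io/", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())  # JSON containing the occurrence count
```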
Workshop on Large Language Model Memorization retweeted
Abhilasha Ravichander @lasha_nlp
Want to know what training data has been memorized by models like GPT-4? We propose information-guided probes, a method to uncover memorization evidence in *completely black-box* models, without requiring access to 🙅‍♀️ Model weights 🙅‍♀️ Training data 🙅‍♀️ Token probabilities 🧵1/5
[image attached]
4 replies · 40 reposts · 205 likes · 28.1K views
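The information-guided probes themselves are described in the paper; for intuition only, here is the simplest kind of black-box memorization test, a generic prefix-completion extraction check that needs only generated text (no weights, training data, or token probabilities). The `generate` callable is a hypothetical stand-in for any text-in, text-out model client, and this sketch is not the paper's method.

```python
# Generic black-box verbatim-extraction check (a simple baseline, NOT the
# information-guided probes from the paper). `generate` is a hypothetical
# stand-in for any text-in/text-out model client.
def looks_memorized(generate, snippet: str, prefix_frac: float = 0.5,
                    match_chars: int = 50) -> bool:
    cut = int(len(snippet) * prefix_frac)
    prefix, target = snippet[:cut], snippet[cut:]
    completion = generate(prefix)  # only the output text is needed
    # Verbatim reproduction of the held-out continuation suggests memorization.
    return completion.lstrip().startswith(target.lstrip()[:match_chars])
```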
Workshop on Large Language Model Memorization
Hey all, we will be retweeting works on memorization. Please DM us if you want us to retweet your work. Our submission deadline is 4/15; consider submitting to one of our archival or non-archival tracks!
1 reply · 0 reposts · 0 likes · 94 views
Workshop on Large Language Model Memorization retweeted
Niloofar @niloofar_mire
Adding or removing PII in LLM training can *unlock previously unextractable* info. Even if “John.Mccarthy” never reappears, enough Johns & Mccarthys during post-training can make it extractable later! New paper on PII memorization & n-gram overlaps: arxiv.org/abs/2502.15680
[image attached]
4 replies · 12 reposts · 84 likes · 5.8K views
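As a toy illustration of the n-gram-overlap idea (a simplification for intuition, not the metric from the paper), one can measure what fraction of a PII string's token n-grams also occur in a corpus; the tweet's point is that high unigram overlap can make a name extractable even when the exact bigram never appears.

```python
# Toy token-level n-gram overlap between a PII string and a corpus.
# A simplification for intuition, not the metric from arxiv.org/abs/2502.15680.
def ngrams(tokens: list[str], n: int) -> set[tuple[str, ...]]:
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap(pii: str, corpus: str, n: int) -> float:
    pii_grams = ngrams(pii.lower().split(), n)
    corpus_grams = ngrams(corpus.lower().split(), n)
    return len(pii_grams & corpus_grams) / len(pii_grams) if pii_grams else 0.0

corpus = "john wrote to mccarthy ; john cited mccarthy in the report"
print(overlap("john mccarthy", corpus, n=1))  # 1.0: both unigrams occur
print(overlap("john mccarthy", corpus, n=2))  # 0.0: the exact bigram never does
```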
Workshop on Large Language Model Memorization retweeted
Ashwinee Panda @PandaAshwinee
we show for the first time ever how to privacy audit LLM training. we give new SOTA methods that show how much models can memorize. by using our methods, you can know beforehand whether your model is going to memorize its training data, and how much, and when, and why! (1/n 🧵)
[image attached]
1 reply · 22 reposts · 126 likes · 14.2K views
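The thread's new auditing methods are in the paper; as background intuition only, here is the classic loss-threshold membership-inference test that many privacy audits build on. This is a generic baseline requiring white-box loss access, not the thread's method, and the model name is illustrative.

```python
# Toy loss-based membership-inference check, a classic building block
# behind many privacy audits. Generic baseline for intuition, NOT the
# method from the thread; the model name is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")            # illustrative model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def mean_nll(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        return model(ids, labels=ids).loss.item()      # mean per-token NLL

# A suspiciously low loss on a candidate string, relative to comparable
# reference text, is evidence the candidate was seen during training.
candidate = "a string suspected to appear in the training data"
reference = "a comparable string known to be absent from the training data"
print(mean_nll(candidate), mean_nll(reference))
```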
Workshop on Large Language Model Memorization
📝 Key Deadlines:
- ARR submission: Feb 15, 2025
- Direct submission: Mar 25, 2025
- ARR commitment: Apr 17, 2025
- Notification: Apr 27, 2025
- Camera-ready: May 16, 2025
1 reply · 0 reposts · 4 likes · 159 views
Workshop on Large Language Model Memorization
📢 The First Workshop on Large Language Model Memorization (L2M2) will be co-located with @aclmeeting in Vienna! 🎉
💡 L2M2 brings together researchers to explore memorization from multiple angles. Whether it's text-only LLMs or vision-language models, we want to hear from you! 🌍
1 reply · 3 reposts · 13 likes · 4.5K views