Fabian David Schmidt
@fdschmidt

99 posts

Member of Technical Staff @ Cohere | PhD from Uni of Würzburg on multilinguality & multimodality | prev. Mila & LTL@UniCambridge

Joined December 2022
108 Following · 176 Followers

Pinned Tweet
Fabian David Schmidt @fdschmidt
Introducing NLLB-LLM2Vec! 🚀 We fuse the NLLB encoder & Llama 3 8B trained w/ LLM2Vec to create NLLB-LLM2Vec which supports cross-lingual NLU in 200+ languages🔥 Joint work w/ Philipp Borchert, @licwu, and @gg42554 during my great research stay at @cambridgeltl
3 replies · 18 retweets · 100 likes · 12.8K views
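The fusion described in the pinned tweet can be caricatured as a small trainable projection that maps pooled token states from a frozen multilingual encoder into the LLM's representation space, with both backbones frozen. This is a toy numpy sketch of that general recipe only, not the paper's actual architecture; every dimension, name, and the mean-pooling choice here is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: encoder hidden size vs. LLM hidden size.
D_ENC, D_LLM, SEQ = 1024, 4096, 16

def mean_pool(hidden, mask):
    """Average token states, ignoring padded positions."""
    mask = mask[:, None].astype(hidden.dtype)
    return (hidden * mask).sum(axis=0) / mask.sum()

# Toy stand-ins for the frozen encoder's token states and attention mask.
enc_states = rng.normal(size=(SEQ, D_ENC))
attn_mask = np.ones(SEQ, dtype=bool)

# The only trainable piece in this sketch: a projection bridging the
# encoder's space and the LLM's space.
W_proj = rng.normal(scale=D_ENC ** -0.5, size=(D_ENC, D_LLM))

pooled = mean_pool(enc_states, attn_mask)   # (D_ENC,)
fused = pooled @ W_proj                     # (D_LLM,) lives in the LLM's space
```

The point of the sketch is the shape of the recipe: freeze both models, pool one side, and learn only the bridge between the two spaces.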
Fabian David Schmidt retweeted
Siva Reddy @sivareddyg
LLM2Vec-Gen represents a major paradigm shift for embeddings/retrieval. Why encode the query when the LLM already knows what to look for and can directly produce an embedding for it? Best part: it's self-supervised, and it does all of this while the LLM remains completely frozen.

Think about it: "solve x² + 3x − 4 = 0" has zero reasoning in it. But the LLM's response does. By encoding the response, the embedding captures the reasoning, and the better the LLM reasons, the better the embedding. This is why our results scale with model size. As LLMs get smarter, our embeddings automatically get better.

LLM2Vec-Gen is also the first demonstration of the promise of @ylecun's JEPA for text embeddings. The alignment loss is JEPA: predict in representation space, not token space. The reconstruction loss goes beyond that; it keeps embeddings decodable.

This paradigm shift opens new frontiers:
🔬 Can we build a full JEPA for language where the teacher and student are the same LLM?
⚡ Can LLMs reason in compressed space without ever generating text?
🤖 Can agents reason in compression tokens and carry that directly into retrieval?
💬 Can agents talk to each other in compression tokens instead of text: dense, fast, and still human-readable?

LLM2Vec-Gen is a first step toward all four.
Vaibhav Adlakha@vaibhav_adlakha

Your LLM already knows the answer. Why is your embedding model still encoding the question?

🚨Introducing LLM2Vec-Gen: your frozen LLM generates the answer's embedding in a single forward pass, without ever generating the answer. Not only that, the frozen LLM can decode the embedding back into text.

🏆 SOTA self-supervised embeddings
🛡️ Free transfer of instruction-following, safety, and reasoning

7 replies · 27 retweets · 171 likes · 21.5K views
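The thread's core idea, embedding where the answer would be rather than the question, can be caricatured in a few lines of numpy: treat the frozen LLM's final-position hidden state over the prompt as the embedding (the state from which the answer would have been generated) and rank documents by cosine similarity. This is a toy sketch with random stand-in states; the method's actual projection, alignment/reconstruction losses, and decoding step are not shown, and every name and dimension here is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 512  # hypothetical hidden size

def last_token_state(prompt_states):
    """The final-position hidden state already encodes the LLM's 'plan'
    for its answer; the sketch uses it directly as the embedding, so no
    answer tokens are ever generated."""
    return prompt_states[-1]

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy stand-ins for frozen-LLM hidden states of a query and two documents.
query_states = rng.normal(size=(8, D))
doc_a = rng.normal(size=(D,))                           # unrelated document
doc_b = query_states[-1] + 0.1 * rng.normal(size=(D,))  # near-duplicate

q_emb = last_token_state(query_states)
ranked = sorted([("a", cosine(q_emb, doc_a)), ("b", cosine(q_emb, doc_b))],
                key=lambda t: -t[1])
```

In the toy setup the near-duplicate document ranks first; the claim in the thread is that, with a real LLM, the same single forward pass buys you an embedding of the *answer's* content.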
Fabian David Schmidt retweeted
Vaibhav Adlakha @vaibhav_adlakha
Your LLM already knows the answer. Why is your embedding model still encoding the question?

🚨Introducing LLM2Vec-Gen: your frozen LLM generates the answer's embedding in a single forward pass, without ever generating the answer. Not only that, the frozen LLM can decode the embedding back into text.

🏆 SOTA self-supervised embeddings
🛡️ Free transfer of instruction-following, safety, and reasoning
4 replies · 36 retweets · 190 likes · 48.9K views
Fabian David Schmidt retweeted
Marius Mosbach @mariusmosbach
Check out our new preprint on the superficial alignment hypothesis (SAH). 👇 We operationalize the SAH via the length of the shortest program that achieves a certain performance on a task, unifying previous views on the SAH and showing how post-training affects "superficiality".
tom@tvergarabrowne

first paper of the phd 🥳 the Superficial Alignment Hypothesis (SAH) argues that pre-training adds most of the knowledge to a model, and post-training merely surfaces it. however, this hypothesis has lacked a precise definition. we fix this.

2 replies · 2 retweets · 8 likes · 720 views
Fabian David Schmidt retweeted
Cohere Labs @Cohere_Labs
Introducing ✨Tiny Aya✨, a family of massively multilingual small language models built to run where people actually are. Tiny Aya delivers strong multilingual performance in 70+ global languages in a 3.35B parameter model, efficient enough to run locally, even on a phone.
30 replies · 158 retweets · 859 likes · 185.4K views
Fabian David Schmidt retweeted
Desmond Elliott @delliott
I am grateful that the Carlsberg Foundation is supporting our basic research on tokenization-free language models at the University of Copenhagen. I will be hiring Ph.D students to start in September 2026. Feel free to reach out early if you want to express informal interest.
Carlsbergfondet@Carlsbergfondet

From political science to archaeology. From astrophysics to marine biology and glaciology. Today, 159 researchers receive a grant from the Carlsberg Foundation for a wide range of basic-research initiatives. See which projects have received funding 👉bit.ly/4iK2fV2 #dkforsk

1 reply · 7 retweets · 25 likes · 2.1K views
Fabian David Schmidt retweeted
Cohere @cohere
Introducing our latest breakthrough in AI search and retrieval: Rerank 4! It’s the most advanced set of reranking models on the market, with best-in-class performance across search relevance, speed, deployment flexibility, multilingual support, and domain-specific understanding.
11 replies · 51 retweets · 170 likes · 41.1K views
Fabian David Schmidt retweeted
Josip Jukic @chatruncata
Presenting our paper "Disentangling Latent Shifts of In-Context Learning with Weak Supervision" (with Jan Šnajder) at NeurIPS 2025, San Diego: 🗓 Fri, Dec 5 · 11:00–14:00 PST 📍 Exhibit Hall C/D/E · Poster #2615 Paper: openreview.net/pdf?id=tAq9Gxd… #NeurIPS2025
0 replies · 1 retweet · 7 likes · 684 views
Fabian David Schmidt retweeted
Verna Dankers @vernadankers
Ready for day 3 of #EMNLP2025 🎉🎉 I've been on the lookout for memorization, unlearning, interp, memory module papers & more, chat w me if these topics fascinate you too😻 Looking forward to more of Suzhou, the conf & my BlackboxNLP keynote Sunday 1.45PM! blackboxnlp.github.io/2025/
0 replies · 10 retweets · 57 likes · 5.1K views
Fabian David Schmidt retweeted
Mehar Bhatia @bhatia_mehar
🚨How do LLMs acquire human values?🤔 We often point to preference optimization. However, in our new work, we trace how and when model values shift during post-training and uncover surprising dynamics. We ask: How do data, algorithms, and their interaction shape model values?🧵
2 replies · 46 retweets · 127 likes · 39.6K views
Fabian David Schmidt retweeted
Tiancheng Hu @tiancheng_hu
Instruction tuning unlocks incredible skills in LLMs, but at a cost: they become dangerously overconfident. You face a choice: a well-calibrated base model or a capable but unreliable instruct model. What if you didn't have to choose? What if you could navigate the trade-off? (1/8)
3 replies · 4 retweets · 14 likes · 1.1K views
Fabian David Schmidt retweeted
Catherine Arnett @linguist_cat
I’m so excited that Global PIQA is out! This has been a herculean effort by our 300+ contributors. The result is an extremely high-quality, culturally-specific benchmark for over 100 languages.
Multilingual Representation Workshop @ EMNLP 2025@mrl_workshop

Introducing Global PIQA, a new multilingual benchmark for 100+ languages. This benchmark is the outcome of this year’s MRL shared task, in collaboration with 300+ researchers from 65 countries. This dataset evaluates physical commonsense reasoning in culturally relevant contexts.

1 reply · 7 retweets · 35 likes · 4.5K views
Fabian David Schmidt retweeted
Multilingual Representation Workshop @ EMNLP 2025
Introducing Global PIQA, a new multilingual benchmark for 100+ languages. This benchmark is the outcome of this year’s MRL shared task, in collaboration with 300+ researchers from 65 countries. This dataset evaluates physical commonsense reasoning in culturally relevant contexts.
2 replies · 57 retweets · 114 likes · 26.5K views
Fabian David Schmidt retweeted
Valentina Pyatkin @valentina__py
I will be giving a talk at @ETH_AI_Center next week, on RLVR for verifiable instruction following, generalization, and reasoning! 📢 Join if you are in Zurich and interested in hearing about IFBench and our latest Olmo and Tülu works at @allen_ai
3 replies · 9 retweets · 109 likes · 10.2K views
Fabian David Schmidt retweeted
Marius Mosbach @mariusmosbach
Come talk to @Ara_Krishnan and me about our recent paper on frequency effects of unlearning and how @allen_ai 's Olmo model and toolkit made this work so much easier. 🚀
Ai2@allen_ai

Olmo isn’t just open weights—it’s an open research stack. Try it in the Ai2 Playground: playground.allenai.org AMA on Discord: Tues, Oct 28 @ 8:00 AM PT with some of the researchers behind these studies + an Ai2 Olmo teammate. Join: discord.gg/ai2

1 reply · 5 retweets · 19 likes · 1.5K views
Fabian David Schmidt retweeted
Marius Mosbach @mariusmosbach
If you are thinking a lot about CoT and multi-agent communication these days, check out Michael's work below 👇. And make sure to keep an eye on his work going forward, more great things to come! 👨🏻‍🍳
Michael Rizvi-Martel@frisbeemortel

Is there such a thing as too many agents in multi-agent systems? It depends! 🧵 Our work reveals 3 distinct regimes where communication patterns differ dramatically. More on our findings below 👇 (1/7)

0 replies · 3 retweets · 10 likes · 1.5K views
Fabian David Schmidt retweeted
Michael Rizvi-Martel @frisbeemortel
Is there such a thing as too many agents in multi-agent systems? It depends! 🧵 Our work reveals 3 distinct regimes where communication patterns differ dramatically. More on our findings below 👇 (1/7)
1 reply · 11 retweets · 35 likes · 13.5K views