Yuval Reif

24 posts

@YuvalReif

PhD student @ The Hebrew University | Focused on interpretability and tokenization in LLMs #NLProc

Joined August 2022
102 Following · 66 Followers
Pinned Tweet
Yuval Reif @YuvalReif
Is dataset debiasing the right path to robust models? In our work, “Fighting Bias with Bias”, we argue that in order to promote model robustness, we should in fact amplify biases in training sets. w/ @royschwartzNLP In #ACL2023NLP Findings Paper: arxiv.org/abs/2305.18917 🧵👇
Yuval Reif retweeted
Itay Itzhak @ ICLR 🇧🇷 @Itay_itzhak_
Ever used a top-ranked LLM that just... felt wrong for you? You’re not alone. Instead of leaderboards, many of us turn to "vibe-testing" - manually comparing models to our own needs. But can we turn these feelings into a structured evaluation? New paper: "From Feelings to Metrics" 🧵
Yuval Reif retweeted
Hadas Orgad @ ICLR @OrgadHadas
New paper: LLMs encode harmful content generation in a distinct, unified mechanism Using weight pruning, we find that harmful generation depends on a tiny subset of the weights that are shared across harm types and separate from benign capabilities. 🧵
Yuval Reif retweeted
Itay Itzhak @ ICLR 🇧🇷 @Itay_itzhak_
🚨New paper alert🚨 🧠 Instruction-tuned LLMs show amplified cognitive biases — but are these new behaviors, or pretraining ghosts resurfacing? Excited to share our new paper, accepted to CoLM 2025🎉! See thread below 👇 #BiasInAI #LLMs #MachineLearning #NLProc
Yuval Reif retweeted
Noy Sternlicht @NoySternlicht
🚨 New paper! We present CHIMERA — a KB of 28K+ scientific idea recombinations 💡 It captures how researchers blend concepts or take inspiration across fields, enabling: 1. Meta-science 2. Training models to predict new combos noy-sternlicht.github.io/CHIMERA-Web 👇 Findings & data:
Yuval Reif retweeted
Michael Hassid @MichaelHassid
The longer reasoning LLM thinks - the more likely to be correct, right? Apparently not. Presenting our paper: “Don’t Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning”. Link: arxiv.org/abs/2505.17813 1/n
Yuval Reif retweeted
Guy Kaplan @GKaplan38844
Heading to @iclr_conf ✈️🧩 ‘Tokens→Words’ shows how LLMs build full‑word representations from sub‑word tokens and offers a tool for vocab expansion. 🚀 See our #ICLR2025 poster ‑ 26.4, 15:00‑17:30. 📄 arxiv.org/abs/2410.05864 🔗 guykap12.github.io/FromTokens2Wor… 👇
Quoted tweet (Guy Kaplan @GKaplan38844):
📢Paper release📢 : 🔍 Ever wondered how LLMs understand words when all they see are tokens? 🧠 Our latest study uncovers how LLMs reconstruct full words from sub-word tokens, even when misspelled or previously unseen. arxiv.org/pdf/2410.05864 (preprint) 👀 👇 [1/7]
Yuval Reif retweeted
Sheridan Feucht @sheridan_feucht
[📄] Are LLMs mindless token-shifters, or do they build meaningful representations of language? We study how LLMs copy text in-context, and physically separate out two types of induction heads: token heads, which copy literal tokens, and concept heads, which copy word meanings.
Yuval Reif retweeted
Nitzan Barzilay @Nitzan_Barzilay
I'd like to collect names of nonprofits that our family would be glad to see donations made to in memory of my brother Noam. Tell me about Israeli nonprofits and initiatives doing good work in mental health.
Yuval Reif retweeted
Hadas Orgad @ ICLR @OrgadHadas
Hallucinations are a subject of much interest, but how much do we know about them? In our new paper, we found that the internals of LLMs contain far more information about truthfulness than we knew! 🧵 Project page >> llms-know.github.io Arxiv >> arxiv.org/abs/2410.02707
Yuval Reif retweeted
Guy Kaplan @GKaplan38844
📢Paper release📢 : 🔍 Ever wondered how LLMs understand words when all they see are tokens? 🧠 Our latest study uncovers how LLMs reconstruct full words from sub-word tokens, even when misspelled or previously unseen. arxiv.org/pdf/2410.05864 (preprint) 👀 👇 [1/7]
Yuval Reif retweeted
Amit Ben-Artzy @Amit_BenArtzy
In which layers does information flow from previous tokens to the current token? Presenting our new @BlackboxNLP paper: “Attend First, Consolidate Later: On the Importance of Attention in Different LLM Layers” arxiv.org/abs/2409.03621 1/n
Yuval Reif retweeted
Aviv Slobodkin @NeurIPS @lovodkin93
Ever skimmed an article, pinpointing key info, and wished for a tailor-made summary without crafting it yourself?🤔 Introducing SummHelper: your go-to for personalized summarization. 📜✏️ w/ Niv Nachum @pyshmulik @obspp18 Ido Dagan 1/n
Yuval Reif retweeted
Itay Itzhak @ ICLR 🇧🇷 @Itay_itzhak_
📢 New paper alert! 📢 Thrilled to announce `Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias'. Do instruction tuning and RLHF amplify biases in LMs? 🧵 Check it out arxiv.org/abs/2308.00225 W @boknilev @GabiStanovsky and N. Rosenfeld.
Yuval Reif retweeted
Yo Shavit @yonashav
The data used to train an AI model is vital to understanding its capabilities and risks. But how can we tell whether a model W actually resulted from a dataset D? In a new paper, we show how to verify models' training-data, incl the data of open-source LMs!arxiv.org/abs/2307.00682
Yuval Reif retweeted
Netta Madvil @NettaMadvil
Read🧐, Look 👀 or Listen🎧? What’s needed to solve a multimodal dataset. 📣Excited to share our two-step method that maps each instance in a multimodal dataset to the modalities required for processing it. w. @YonatanBitton @royschwartzNLP Paper📄: arxiv.org/abs/2307.04532 1/5
Yuval Reif @YuvalReif
We also compare automatically-extracted anti-biased test splits to manually curated challenge sets (HANS and PAWS). Anti-biased sets are as hard as challenge sets, but capture a more diverse set of biases (occurring in all task labels, vs. mostly in one label). 7/