Pulkit Gopalani

13 posts

@GopalaniPulkit

Research intern @IFM_MBZUAI | PhD candidate @UMichCSE | prev. @IITKanpur

San Francisco Bay Area · Joined April 2023
932 Following · 102 Followers
Pinned Tweet
Pulkit Gopalani @GopalaniPulkit
Excited to announce our recent work on understanding training-time emergence in Transformers! Thread🧵(1/11)
2 replies · 8 reposts · 37 likes · 9.2K views
Pulkit Gopalani retweeted
Yongyi Yang @YongyiYang7
What drives in-context learning in LLMs? New paper: Provable Low-Frequency Bias of In-Context Learning of Representations. We show LLMs have a low-frequency bias when learning representations in context, offering a theoretical answer to several previously open questions. 🧵👇
1 reply · 8 reposts · 28 likes · 5.7K views
Stephanie Chan @scychan_brains
Emergence in transformers is a real phenomenon! Behaviors and capabilities can appear in models in sudden ways. Emergence is not always just a "mirage". Compiling some examples here (please share any I missed): 🧵
11 replies · 40 reposts · 353 likes · 30.9K views
Pulkit Gopalani @GopalaniPulkit
Repetitive sequences are easy for Transformers: we show that training on sequences like ‘x_1, x_2, …, x_n, [sep] x_1, x_1, …, x_1’ (or other similar sequences) does not involve loss plateaus like other algorithmic tasks, and the loss converges in a few training steps. (10/11)
1 reply · 0 reposts · 0 likes · 1.1K views