Sam Acquaviva

12 posts


@Sam_Acqua

now @ after thought ai ex-cocosci @ mit

Joined February 2014
211 Following · 119 Followers
Sam Acquaviva retweeted
Larry Dial @classiclarryd
New NanoGPT Speedrun WR at 86.8 (-0.4s) from @.samacqua on GitHub, by tuning and reusing the transpose_copy kernel during the cross entropy backward calc. Outside the main speedrun track, Sam did an interesting experiment in Jan showing how test-time training can improve perplexity. github.com/KellerJordan/m…. github.com/KellerJordan/m…
Sam Acquaviva retweeted
Reece Shuttleworth @ReeceShuttle
🧵 LoRA vs full fine-tuning: same performance ≠ same solution. Our NeurIPS '25 paper 🎉 shows that LoRA and full fine-tuning, even when equally well fit, learn structurally different solutions, that LoRA forgets less, and that a simple intervention makes it forget even less! Read on for behavioral differences (forgetting, continual learning) and other analysis! Paper: arxiv.org/pdf/2410.21228 (1/7)
Sam Acquaviva retweeted
Kevin Ellis @ellisk_kellis
New paper: World models + Program synthesis by @topwasu 1. World modeling on-the-fly by synthesizing programs w/ 4000+ lines of code 2. Learns new environments from minutes of experience 3. Positive score on Montezuma's Revenge 4. Compositional generalization to new environments topwasu.github.io/poe-world [1/n]
Sam Acquaviva retweeted
evanthebouncy @evanthebouncy
I've recently started my job as an asst professor at NTU, Singapore. If you are ever in town come say hi :)
Lior Pachter @lpachter
Aristotle was the first to notice honeybees dancing. In 1927 Karl von Frisch decoded the waggle. How it works was "explained" by MV Srinivasan AM FRS in the 1990s. Except @NeuroLuebbert found his papers are junk. A 🧵 about her discovery & our report: arxiv.org/abs/2405.12998 1/
Sam Acquaviva retweeted
Vedang Lad @vedanglad
1/7 Wondered what happens when you permute the layers of a language model? In our recent paper with @tegmark, we swap and delete entire layers to understand how models perform inference - in doing so we see signs of four universal stages of inference!
Steve Magness @stevemagness
If you think that going 2 beats per minute higher in a workout will fundamentally change it… Or that 3.9 mmol of lactate vs 4.1 on a given day significantly alters the workout… It's a misguided feeling of precision that isn't there or meaningful in most cases.
Steve Magness @stevemagness
Training zones are lines we draw that roughly correspond to a physiological marker to make classification easier. That doesn't mean they are bad. They can be useful. Classification makes things usable. But they aren't magic. Don't be fooled by a false feeling of precision.
Remi #Art @remi_durant
I've got a new version of my VQGAN notebook almost ready to go, but I could use some help testing it. Anyone got some time to dig through it over the next couple days and help me make sure everything is working?
Sam Acquaviva @Sam_Acqua
where are all the dancing sharks