Vitória Pacela

120 posts

@vpacela

PhD student @Mila_Quebec, @UMontreal. Previously: @AIatMeta, @helsinkiuni. She/elle/ela.

Montréal · Joined September 2010
1.2K Following · 703 Followers
Vitória Pacela retweeted
David Klindt @klindt_david
So excited to finally share this!

Linear probes often outperform SAEs, especially out-of-distribution (OOD). @thesubhashk @JoshAEngels et al showed this convincingly (arxiv.org/abs/2502.16681). This prompted @NeelNanda5 and others to de-emphasize SAE research. Empirically, fair enough.

But we think the theoretical case for dictionary learning was dismissed too quickly. @oneill_c previously showed SAEs can't do proper sparse coding (arxiv.org/abs/2411.13117). @shruti_joshi @vpacela and @isacama_phys took this further and showed how this leads to problems particularly in OOD settings. So the issue may not be with dictionary learning itself, but with the current tools.

Here's the core argument: if neural representations are in superposition, i.e. more features than dimensions encoded linearly (arxiv.org/abs/2503.01824), then linear probes fundamentally cannot be the answer. This is a compressed sensing problem. There's a linear measurement (the representation) and a nonlinear inference procedure (like an SAE encoder) that recovers the higher-dimensional sparse signal. Linear algebra tells us error-free recovery is impossible if decoding is restricted to be linear. (but see this cool work if errors are acceptable arxiv.org/abs/2602.11246)

Check out our video: We have some neat demonstrations here. A linear decision boundary in 3D becomes nonlinear in 2D, even though all sparse combinations of latents remain distinguishable. Compressed sensing works: we can, in principle, recover the high-dimensional latent space where linear probes work and generalize OOD.

Where does this leave us? With finite data and millions of concepts, simpler methods may perform better for a while. But if we want interpretability and safety methods that work OOD, especially compositional generalization covering all possible jailbreaks and real-world failures, we'll have to build bottom up from the right theory.
@kennylpeng @thebasepoint @tegmark @yash_j_sharma @woog09 @livgorton @EkdeepL @thomas_fel_ @nsaphra
Shruti Joshi @_shruti_joshi_

SAEs fail at OOD tasks. Why? Features in superposition are linearly representable but not linearly accessible. Instead of discarding sparse coding, we embrace the geometry of superposition and use methods equipped to handle the nonlinearity it induces.

English
4
39
263
27.3K
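The compressed sensing argument in David Klindt's post above (a linear measurement, a nonlinear inference step, and no exact linear decoder) can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and does not come from the post or the linked papers: three unit-norm feature directions placed 120 degrees apart in 2D, codes that are 1-sparse and non-negative, a bias-free least-squares probe, and the hypothetical helper names `represent`, `relu_readout`, and `linear_probe_max_error`.

```python
import math

# Hypothetical toy setup (not from the post): 3 features stored in 2 dimensions,
# with unit-norm directions spaced 120 degrees apart.
D = [(math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3)) for k in range(3)]


def represent(z):
    """Linear measurement x = sum_k z[k] * D[k]."""
    return (sum(zk * d[0] for zk, d in zip(z, D)),
            sum(zk * d[1] for zk, d in zip(z, D)))


def relu_readout(x):
    """Nonlinear readout z_hat[k] = max(0, <D[k], x>)."""
    return [max(0.0, d[0] * x[0] + d[1] * x[1]) for d in D]


def linear_probe_max_error(codes, target):
    """Fit a bias-free least-squares probe w for z[target]; return its worst error."""
    xs = [represent(z) for z in codes]
    # Normal equations (sum x x^T) w = sum x * z[target], solved as a 2x2 system.
    a00 = sum(x[0] * x[0] for x in xs)
    a01 = sum(x[0] * x[1] for x in xs)
    a11 = sum(x[1] * x[1] for x in xs)
    b0 = sum(x[0] * z[target] for x, z in zip(xs, codes))
    b1 = sum(x[1] * z[target] for x, z in zip(xs, codes))
    det = a00 * a11 - a01 * a01
    w = ((a11 * b0 - a01 * b1) / det, (a00 * b1 - a01 * b0) / det)
    return max(abs(w[0] * x[0] + w[1] * x[1] - z[target]) for x, z in zip(xs, codes))


# 1-sparse, non-negative codes: exactly one active feature per sample.
codes = [[a if k == i else 0.0 for k in range(3)]
         for i in range(3) for a in (0.5, 1.0, 2.0)]

relu_max_error = max(abs(zh - zk)
                     for z in codes
                     for zh, zk in zip(relu_readout(represent(z)), z))
probe_max_error = linear_probe_max_error(codes, target=0)

print(f"ReLU readout max error: {relu_max_error:.2e}")
print(f"Least-squares linear probe max error: {probe_max_error:.3f}")
```

In this toy case the ReLU readout recovers every code up to floating-point error, while the least-squares linear probe for the first feature is off by about 0.67 on the worst sample. The exact ReLU recovery depends on the chosen geometry and on the 1-sparse, non-negative codes; it is not a claim about SAE encoders in general.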
Vitória Pacela @vpacela
Excited about this collaboration! We share some insights on why linear probes don't generalize under the linear representation hypothesis and expand on why SAEs are still not enough for compositional OOD generalization under superposition/overcompleteness. 👇
Shruti Joshi @_shruti_joshi_

SAEs fail at OOD tasks. Why? Features in superposition are linearly representable but not linearly accessible. Instead of discarding sparse coding, we embrace the geometry of superposition and use methods equipped to handle the nonlinearity it induces.

English
0
5
20
2.2K
Vitória Pacela retweeted
Shruti Joshi @_shruti_joshi_
Mechanistic interpretability aims to understand models — and the more superhuman or incoherent they become, the more we need that understanding to be reliable. We propose a framework for this, drawing on established tools from causal reasoning and statistical identifiability: 🧵
English
3
16
113
37.9K
Vitória Pacela retweeted
Pope Leo XIV @Pontifex
Technological innovation can be a form of participation in the divine act of creation. It carries an ethical and spiritual weight, for every design choice expresses a vision of humanity. The Church therefore calls all builders of #AI to cultivate moral discernment as a fundamental part of their work—to develop systems that reflect justice, solidarity, and a genuine reverence for life.
English
2.1K
4.9K
33.8K
5.5M
Vitória Pacela retweeted
Andrew Gordon Wilson @andrewgwils
Bach is so timeless because he wasn't writing for people, he was writing for a higher power. Try writing your next paper for God. Imagine how many rubbish papers we wouldn't see anymore. Your audience sees your every thought and intention. There would be no ego, no pretense.
English
5
21
286
36.2K
Mo Samsami @M_R_Samsami
This week I joined @GoogleDeepMind as a research engineer on the Reinforcement Learning Engineering team. Continuing my focus on world models, I begin with Genie. A lot to learn, a lot to contribute!
English
39
20
1.1K
70.4K
Vitória Pacela retweeted
Divyat Mahajan @divyat09
Happy to share that Compositional Risk Minimization has been accepted at #ICML2025 📌Extensive theoretical analysis along with a practical approach for extrapolating classifiers to novel compositions! 📜 arxiv.org/abs/2410.06303
English
5
30
168
19.6K
Vitória Pacela @vpacela
@gene_is_here @FrnkNlsn I don't think the point of the book is to give emotions to AI, it's rather a computational theory for suffering. The author states explicitly that the theory will not cover "social-driven" suffering. What biases are you referring to?
English
0
0
0
27
Frank Nielsen @FrnkNlsn
Very refreshing reading, free online 1/2
English
19
344
2.7K
320.3K
Vishal Jogdand @itsvishalpj
@FrnkNlsn Ohh, great concept, but suffering is a never-ending part of being human... until someone understands the Buddhist theory of suffering.
English
1
0
2
2K
Vitória Pacela @vpacela
@PietroGuccione @FrnkNlsn I wouldn't be so quick to criticize or call it useless. The book is far from shallow. The author is simply explicitly stating that the characterization of suffering will not cover social sources, but it gets very deep into the rest.
English
1
0
0
25
Pietro Guccione 💔 @PietroGuccione
@FrnkNlsn "...suffering is mainly caused by frustration, which is the failure of an agent to achieve a goal...". Sorry, human suffering is more complex than this. Useless book if the premise is this one.
English
1
0
4
367
Vitória Pacela retweeted
Reyhane Askari @ReyhaneAskari
Can't say enough about how much I have enjoyed working with Adriana and Michal and the FAIR team over the past 2.5 years. If you have a background in generative modeling/diffusion/flows, I definitely recommend applying.
Adriana Romero-Soriano @adri_romsor

We're looking for a postdoc to work with us in FAIR Montreal @AIatMeta. Interested in building generative visual models of the world and leveraging them to train downstream ML models? Apply: metacareers.com/jobs/376087892… cc:@hall__melissa @ReyhaneAskari @JakobVerbeek @michal_drozdzal

English
0
4
24
1.8K