Satwik Bhattamishra

250 posts

@satwik1729

CS PhD student at Oxford | Worked at Google, Cohere, and Microsoft Research

Oxford, England · Joined December 2019

808 Following · 839 Followers

Pinned Tweet

Satwik Bhattamishra@satwik1729·1d

Given black-box access to a Transformer's output, can we efficiently recover its parameters? We analyse the learnability of attention-based models with query access in our new work. Accepted at #ICML2026 🎉 Work done with @shahkulin98, @mhahn29 and Varun Kanade. 🧵

Satwik Bhattamishra@satwik1729·17h

@AiDevCraft @shahkulin98 @mhahn29 I suppose it should be possible to have an O(rd) algorithm for the noisy oracle regime but we haven't shown it

AiDevCraft@AiDevCraft·19h

The multi-head non-identifiability result is doing double duty here — it's a learnability obstacle in the paper, but it's also the natural model-extraction defense commercial APIs implicitly rely on. Does the O(rd) compressed-sensing speedup survive the noisy-oracle regime when r is small?

Satwik Bhattamishra@satwik1729·17h

@_Suresh2 @shahkulin98 @mhahn29 Query access typically means just input-output pairs, not intermediate representations. The difference from the traditional setting is that the learner decides which inputs it wants labels for, rather than receiving random labelled examples.

Suresh@_Suresh2·23h

@satwik1729 @shahkulin98 @mhahn29 does query access mean just input-output pairs or intermediate states too?
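
As a toy illustration of the distinction drawn in this exchange (not the algorithm from the paper), the sketch below contrasts a learner with query access, which chooses its own inputs, against a passive learner that only sees randomly drawn labelled examples. The target is an assumed one-dimensional threshold function, and every name (`make_oracle`, `learn_with_queries`, `learn_from_random_examples`) is hypothetical.

```python
import random
from typing import Callable


def make_oracle(threshold: float) -> Callable[[float], int]:
    """Hypothetical black-box target: returns 1 if x >= threshold, else 0."""
    def oracle(x: float) -> int:
        return 1 if x >= threshold else 0
    return oracle


def learn_with_queries(oracle: Callable[[float], int], num_queries: int) -> float:
    """Query learner: chooses every input itself (binary search on [0, 1])."""
    lo, hi = 0.0, 1.0
    for _ in range(num_queries):
        mid = (lo + hi) / 2
        if oracle(mid) == 1:
            hi = mid  # threshold is at or below mid
        else:
            lo = mid  # threshold is above mid
    return (lo + hi) / 2


def learn_from_random_examples(
    oracle: Callable[[float], int], num_examples: int, seed: int = 0
) -> float:
    """Passive learner: inputs are drawn at random; it cannot choose them."""
    rng = random.Random(seed)
    lo, hi = 0.0, 1.0
    for _ in range(num_examples):
        x = rng.random()
        if oracle(x) == 1:
            hi = min(hi, x)  # smallest input seen so far with label 1
        else:
            lo = max(lo, x)  # largest input seen so far with label 0
    return (lo + hi) / 2


if __name__ == "__main__":
    target = make_oracle(0.3141)  # assumed threshold, for illustration only
    print("query learner:  ", learn_with_queries(target, 20))
    print("passive learner:", learn_from_random_examples(target, 20))
```

Both learners see only input-output pairs; neither inspects the oracle's internals. Each query halves the interval the first learner still considers, so 20 queries leave an interval of width 2^-20, while the passive learner's precision depends on where the random inputs happen to fall.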

Satwik Bhattamishra@satwik1729·17h

@E_FutureFan @shahkulin98 @mhahn29 Apart from that, practical APIs are for language models whereas we consider regressors and classifiers to begin with. Right now, the results are of theoretical interest and hopefully serve as stepping stones for more practically relevant algorithms.

Satwik Bhattamishra@satwik1729·17h

@E_FutureFan @shahkulin98 @mhahn29 Hey, thanks for the question. While security was one of the motivations, our current results have no immediate consequences for practical models: they cover single-head attention and one-layer models, whereas practical models are multi-layer and multi-head.

Satwik Bhattamishra@satwik1729·1d

We believe there are several open directions around this problem, including multi-head attention, identifiability, and other formulations of query learning. Check out the paper for more details: arxiv.org/abs/2601.16873

Satwik Bhattamishra@satwik1729·1d

Lastly, the multi-head problem appears more difficult. Multi-head attention is not identifiable in the same sense as single-head attention, and query learning it would require additional structural assumptions. We discuss some possible proof directions in the work.

Satwik Bhattamishra@satwik1729·6d

@DamienTeney @mhahn29 The experiments in the paper explore that, though only for models generating small regular languages. For more involved or realistic tasks, one would need a more efficient algorithm.

Satwik Bhattamishra@satwik1729·6d

@DamienTeney @mhahn29 For example, one could use algorithms for this kind of problem to check whether a language model can generate an undesirable string or pattern with non-negligible probability, such as a password, secret key, offensive word, etc.

Satwik Bhattamishra@satwik1729·22 Mar

Given access to a language model, can we extract an interpretable object like a DFA that captures which strings a language model is likely to generate? Our new work on automata learning theory studies this question. To be presented at #ICLR2026 🎉

Satwik Bhattamishra retweeted

Yash Sarrof@yashYRS·30 Apr

In principle, CoT makes Transformers Turing Complete, but empirically LLMs struggle at longer lengths. In our paper, we study Transformer+CoT length generalization and prove that with a finite vocab, models can't solve problems beyond the restricted class TC0. But there’s a fix🧵

Satwik Bhattamishra retweeted

Charlie London@CharlieLondon02·16 Apr

We've just released a new benchmark that aims to test models' underlying long-horizon reasoning capabilities. This is very hard to do directly, as it is expensive, time-consuming, and can have many confounding factors.

Sumeet Motwani@sumeetrm

We’re releasing LongCoT, an incredibly hard benchmark to measure long-horizon reasoning capabilities over tens to hundreds of thousands of tokens. LongCoT consists of 2.5K questions across chemistry, math, chess, logic, and computer science. Frontier models score less than 10%🧵

Satwik Bhattamishra retweeted

Sid@sid_srk·9 Apr

Come work with me in Toronto, Ontario, Canada on a new kind of social software. More details here

Ambition@ambitionlabsinc

We're looking for a founding ML engineer in Toronto. You'll have a lot of autonomy and compute to make a new genre of social software.

Satwik Bhattamishra retweeted

Michael Rizvi-Martel@frisbeemortel·9 Apr

Latent CoT is an alternative LLM reasoning scheme hypothesized to enable “superposition” allowing models to hold uncertainty over multiple concepts during reasoning 💭 We revisit superposition in 3 latent CoT approaches and find that it is largely an illusion 🔮! More in 🧵

Satwik Bhattamishra retweeted

Michael Hahn@mhahn29·2 Apr

We have 1-2 extra spots due to new funding -- apply by end of April!

Michael Hahn@mhahn29

We’re hiring PhD students and postdocs on LLM theory and interpretability! Topics: 1️⃣ abilities & limitations of transformers and other architectures; 2️⃣ LLM interpretability; 3️⃣ foundations of LLM reasoning; 4️⃣ foundations of AI safety.

Satwik Bhattamishra retweeted

Yash Sarrof@yashYRS·23 Mar

Most work on Transformer length generalization assumes a fixed vocabulary. But in real tasks, longer inputs may have new symbols (e.g. more objects in planning). Our new paper introduces C-RASP* to study this and explains the inconsistent performance of Transformers in planning.
