

Abhishek Shetty

@AShettyV
Incoming Asst. Prof. at @gatech_scs. FODSI Postdoctoral Fellow @MIT. PhD from @Berkeley_EECS. Ex: Microsoft Research, Apple. Apple AI/ML Research Fellow 2023.




In this paper, we study the "extended logit matrix" corresponding to an LLM, a sort of multi-token variant of the well-studied "logit matrix": its rows and columns are indexed by sequences and its entries are determined by the LLM's log-probs on the corresponding sequences. 2/n

The coverage principle: How pre-training enables post-training. New preprint where we look at the mechanisms through which next-token prediction produces models that succeed at downstream tasks. The answer involves a metric we call the "coverage profile", not cross-entropy.


1/n Now that I have a bit more time, I wanted to share more about this paper and my own thought process. There’s a gap in how we talk about data and behavior in LLMs. On the one hand, we say “data is the driver” and try to interpret what human preferences the data is showing. On the other hand, we keep seeing learned behaviors that don’t seem to be present in the data – at least not in any human-readable way. Examples like subliminal transfer, covert malicious finetuning, and weird generalization have been quite intriguing for me. In this paper, we ask what’s really causing that, and whether there’s a simple, general mechanism behind it.


1/7 Excited about our new paper with @axliu42 @GolowichNoah @AShettyV @nhaghtal and Ankur on how data selection can have wild effects!
