Eric Todd
@ericwtodd
Computer Science PhD Student at Northeastern University

Neural networks might speak English, but they think in shapes. Understanding their rich *neural geometry* is key to understanding how they work – and to debugging and controlling them with precision. Starting today, we’re releasing a series of posts on this research agenda. 🧵

Neural networks do math by rotating shapes. We found a shape-rotating calculator hidden inside an LLM – and it’s used for more than just math! (1/6)

Can you solve this algebra puzzle? 🧩 cb=c, ac=b, ab=? A small transformer can learn to solve problems like this! And since the letters don't have inherent meaning, this lets us study how context alone imparts meaning. Here's what we found:🧵⬇️

💡New paper! Woke up to @AnthropicAI's emotion paper and realized: “wait, that's our finding too.” So we arXiv'd immediately. We concurrently uncovered a circular geometry of emotions organized by valence and arousal (VA), as well as steering effects on downstream behaviors like refusal and sycophancy. We further provide a mechanistic account of why: refusal and compliance tokens occupy distinct regions in this space. 1/

Why do distilled diffusion models generate similar-looking images? 🤔 Our Diffusion Target (DT) visualization reveals the secret to diversity. It is the very first time-step! And—there is a simple, training-free way to make them more diverse! Here is how: 🧵👇
