Tensor Cruncher

278 posts

@tensorcruncher

Mechanistic Interpretability and Model Internals. Other interests: Math, Systems, Music

Mumbai · Joined December 2025
415 Following · 19 Followers
Tensor Cruncher @tensorcruncher
On chapter 7 of @rasbt's book, "Build a Large Language Model (From Scratch)". I really appreciate that he has diagrams like this spread across the book. They help you situate yourself while learning.
[Image attached]
Replies 0 · Retweets 0 · Likes 0 · Views 20
Tensor Cruncher retweeted
Sayak Paul @RisingSayak
Expansion is a good thing, and may it never run out! Plus it's Japan 🧨
[Image attached]
Replies 8 · Retweets 3 · Likes 55 · Views 10.5K
Tensor Cruncher @tensorcruncher
Interesting to read about the journey from skip connections to manifold-constrained hyper-connections. It's like watching the early car or plane being developed before our eyes, with each component going through rapid evolution.
Replies 0 · Retweets 0 · Likes 1 · Views 15
Tensor Cruncher @tensorcruncher
This is legit one of the best resources for understanding what the attention mechanism is. @rasbt's approach of first introducing a naive implementation, then adding trainable weights, masking, dropout, and multiple heads one step at a time, really helps.
Tensor Cruncher @tensorcruncher

On chapter 3, "Coding Attention Mechanisms", of @rasbt's book. Looking forward to diving into a few @huggingface transformer model implementations and the nanochat repo after this book!

Replies 0 · Retweets 0 · Likes 1 · Views 52
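For readers who have not seen that progression, here is a minimal sketch of the first step (a naive self-attention with no trainable weights) plus an optional causal mask. It is not code from the book: the function names `softmax` and `naive_self_attention` and the toy embeddings are made up for illustration, and it uses only the Python standard library. It also skips the scaling, trainable weights, dropout, and multiple heads that come later in the chapter.

```python
import math


def softmax(row):
    """Numerically stable softmax; entries equal to -inf get weight 0."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def naive_self_attention(inputs, causal=False):
    """Self-attention without trainable weights (illustrative sketch).

    inputs: list of token embeddings, each a list of floats of equal length.
    causal: if True, token i may only attend to tokens 0..i.
    Returns (attention_weights, context_vectors).
    """
    n = len(inputs)
    dim = len(inputs[0])
    weights = []
    for i in range(n):
        scores = []
        for j in range(n):
            if causal and j > i:
                # Mask out future tokens before the softmax.
                scores.append(float("-inf"))
            else:
                # Attention score = plain dot product of the two embeddings.
                scores.append(sum(a * b for a, b in zip(inputs[i], inputs[j])))
        weights.append(softmax(scores))
    # Each context vector is the attention-weighted sum of all input vectors.
    context = [
        [sum(weights[i][j] * inputs[j][d] for j in range(n)) for d in range(dim)]
        for i in range(n)
    ]
    return weights, context


if __name__ == "__main__":
    # Toy 2-dimensional embeddings for three tokens (made-up values).
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    attn, ctx = naive_self_attention(tokens, causal=True)
    for row in attn:
        print(row)  # first row is [1.0, 0.0, 0.0]: token 0 can only see itself
```

Each row of the returned weights sums to 1. With `causal=True`, every entry above the diagonal is exactly 0, which is the masking step mentioned in the post.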
Tensor Cruncher @tensorcruncher
Studying the attention mechanism makes me feel like 🥰⭐️🚀
Replies 0 · Retweets 0 · Likes 0 · Views 19
Tensor Cruncher retweeted
Goodfire @GoodfireAI
Introducing Silico: the platform for building AI models with the precision of written software. Silico lets researchers and engineers see inside their models, debug failures, and intentionally design them from the ground up. Early access is open now. 🧵(1/10)
Replies 20 · Retweets 114 · Likes 871 · Views 109.5K
Tensor Cruncher @tensorcruncher
On chapter 3, "Coding Attention Mechanisms", of @rasbt's book. Looking forward to diving into a few @huggingface transformer model implementations and the nanochat repo after this book!
[Image attached]
Replies 0 · Retweets 0 · Likes 2 · Views 78
Tensor Cruncher @tensorcruncher
Got this book from @BlueDotImpact for submitting the application to their Technical AI Safety program. Hope to get in!
[Image attached]
Replies 0 · Retweets 0 · Likes 1 · Views 27
Tensor Cruncher @tensorcruncher
“Different modalities have different sparsities / geometries”
Sonia Joseph @soniajoseph_

Interpretability is built on a few core assumptions. Two of our ICLR 2026 @iclr_conf papers suggest some of those assumptions are wrong (or at least highly incomplete).

1. Sparse CLIP: Co-Optimizing Interpretability and Performance in Contrastive Learning
arxiv.org/abs/2601.20075

much of the field has internalized an interpretability–accuracy trade-off: if you want cleaner, more human-understandable features, you sacrifice performance. however, we find that this trade-off is not fundamental.

instead of relying on post-hoc methods (e.g. sparse autoencoders trained on frozen representations), we incorporate sparsity directly into CLIP training. surprisingly, this produces features that are significantly more interpretable while preserving downstream performance.

this result made me more optimistic about intrinsically interpretable models, a direction that was imo written off too early.

2. Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry
arxiv.org/abs/2510.08638

a lot of interpretability work implicitly assumes that vision representations behave like language: sparse, linear, and decomposable into independent features. we find that this assumption is often misleading. instead, vision representations appear partially dense and geometrically structured.

we propose the Minkowski Representation Hypothesis: tokens live in sums of convex regions formed from a small set of “archetypes,” rather than isolated features along linear directions. this reframes how different tasks (classification, segmentation, depth) recruit and organize concepts. it also suggests that many current interpretability tools are mismatched to the actual structure of vision data.

tldr; interpretability can be built into training with surprisingly simple tweaks, and different modalities have different sparsities/geometries. Tailoring the interp method to the modality is super impt!

Replies 0 · Retweets 0 · Likes 0 · Views 32
Tensor Cruncher @tensorcruncher
@ariG23498 @RisingSayak I can’t imagine who that could be. Surely not someone who found themselves on Indian national news for a tweet about traffic 👀
Replies 0 · Retweets 0 · Likes 1 · Views 32
Aritra 🤗 @ariG23498
[Hugging Face ML Club India] We are beyond excited for the next virtual event. We are hosting an incredible researcher and, more than that, an idol of mine (pretty sure of @RisingSayak's as well). They will be talking about the slow death of scaling. I am pretty sure you know who that is, but more information is coming soon. Keep your eyes glued to this space. 🤗
Replies 16 · Retweets 4 · Likes 177 · Views 8.1K
Tensor Cruncher retweeted
elvis @omarsar0
A nice paper worth checking out. (bookmark it)

For a long time, we have had machines that work astonishingly well before we had a real theory of why. This paper argues that the scattered pieces are becoming something like a mechanics of learning, with solvable toy worlds, scaling laws, tractable limits, hyperparameter theories, and universal behaviors starting to line up.

The strange thing is that neural networks are not opaque in the way nature is opaque. We can inspect every weight, gradient, activation, and loss. The challenge is not merely access to the details. It is finding the right level of abstraction, where enough detail is discarded for understanding to become possible.

That is why the physics analogy is useful. Physics often works by giving up on exact microscopic description and finding the right aggregate variables. Pressure, temperature, momentum. Maybe loss landscapes, sharpness, feature formation, scaling exponents, and training dynamics are playing a similar role here.

I am cautious about grand unifying language in AI, but I think we are onto something as a field, and it's an exciting time to be a researcher in the space.
Jamie Simon @learning_mech

1/ Deep learning is going to have a scientific theory. We can see the pieces starting to come together, and it's looking a lot like physics! We're releasing a paper pulling together these emerging threads and giving them a name: learning mechanics. 🔨 arxiv.org/pdf/2604.21691 🔧

Replies 6 · Retweets 23 · Likes 85 · Views 20.8K
Tensor Cruncher retweeted
Leon Lang @Lang__Leon
The Iliad Intensive is a new full-time month-long course on foundational AI alignment. What should you expect? 🧵
[Image attached]
Replies 1 · Retweets 7 · Likes 86 · Views 7.8K
Ryan Kidd @ryan_kidd44
Incredible turnout at the @MATSprogram ICLR 2026 booth and mixer! 230 people registered for our mixer, with 157 waitlisted. AI safety & security has a strong future!
[2 images attached]
Replies 4 · Retweets 5 · Likes 112 · Views 5.6K