
MFA is a beautiful visualization, but not only that.
It’s a practical tool:
competitive localization and better steering, while revealing that concepts live in regions, not just single directions.
Grateful I got to contribute to this with an awesome team!
Or Shafran@OrShafran
It's time to look past dictionary learning for decomposing LM activations. What happens when we instead leverage local geometry? We find a natural region-based decomposition that yields better steering and localization 🧵 1/
English






