Armaan

Armaan retweeted

🚨 New Research! Accepted at the Actionable Interp Workshop at ICML 2025 @ActInterp !!🚨
Can steering vectors be used to mitigate group bias in transformer-based classification models?
We find that they are an effective, cheap, training-free method to mitigate bias post-hoc.






