peter kun retweeted

Can vision transformers learn something useful when you randomly shuffle their layers during training?
Turns out they can!
We present "LayerShuffle: Enhancing Robustness in Vision Transformers by Randomizing Layer Execution Order" 🧵
arxiv.org/abs/2407.04513