Kamesh Krishnamurthy retweeted

immensely proud to share 2 of my bros' work, Hybrid Associative Memories by @kamesh_ai and @leonlufkin
It's basically SSM and Attention merged into one layer, where the attention is also sparse like DSA.
Except you pretrain with the sparsity. It's lowkirk based (1/N) 👇



