Sabitlenmiş Tweet

Some say Gated DeltaNet > Mamba-2. Others say Mamba-2 > Gated DeltaNet.
But what if Gated DeltaNet = Mamba-2? 👀
Well maybe not exactly — but with least-squares preconditioning, we show that they reduce to the same recurrence!
We use this lens to design PDN, PGDN, and PKDA: preconditioned delta-style recurrences that outperform their unpreconditioned counterparts at scale 📈
📄 Paper: arxiv.org/abs/2604.21100 w/ @loo_noel @liquidai
💻 Code: github.com/ntumm120/preco…
English




