Zamantika

Post

Kimi.ai @Kimi_Moonshot · 2d
📎 We've uploaded it to arXiv, enjoy! arxiv.org/abs/2603.15031
Kimi.ai @Kimi_Moonshot

Introducing Attention Residuals: Rethinking depth-wise aggregation.

Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.

🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.

🔗 Full report: github.com/MoonshotAI/Att…
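To make the contrast in the post above concrete, here is a minimal NumPy sketch of the general idea, not the formulation from the report. It assumes a single token vector, a hypothetical learned per-layer query `q`, and plain dot-product scores over the stored outputs of earlier layers; the function names, the toy `tanh` layers, and the scaling are all illustrative assumptions. Block AttnRes is not modeled here.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def standard_residual(outputs):
    # Fixed, uniform accumulation: every earlier output gets weight 1.
    return np.sum(outputs, axis=0)

def attention_residual(outputs, query):
    # Input-dependent mixing (assumed form): score each earlier output
    # against a hypothetical learned query, then take the weighted sum.
    outputs = np.asarray(outputs, dtype=float)   # shape (n_prev, d)
    d = outputs.shape[-1]
    weights = softmax(outputs @ query / np.sqrt(d))
    return weights @ outputs, weights

rng = np.random.default_rng(0)
d, n_layers = 8, 4
outputs = [rng.normal(size=d)]                   # embedding as first entry
for _ in range(n_layers):
    W = rng.normal(size=(d, d)) / np.sqrt(d)     # toy layer weights
    q = rng.normal(size=d)                       # hypothetical learned query
    h, w = attention_residual(outputs, q)        # layer input from the past
    outputs.append(np.tanh(W @ h))               # toy layer transformation
```

Because the weights form a softmax, the mixed input stays a convex combination of earlier outputs rather than an ever-growing sum, which is one plausible reading of the "hidden-state growth" point in the post. With a zero query the weights become uniform and the result is the standard residual sum divided by the number of earlier outputs.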

Danav @Randomflux · 2d
@Kimi_Moonshot Hitting on the nail. Nice work
© 2025 Zamantika. All rights reserved.

zamantika.com