
🇺🇦🇮🇱dmitriy samsonov
18.4K posts

🇺🇦🇮🇱dmitriy samsonov
@d0rc
programming since 1986









Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers. 🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth. 🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale. 🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead. 🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains. 🔗Full report: github.com/MoonshotAI/Att…




5 minutes ago, @karpathy just dropped karpathy/jobs! he scraped every job in the US economy (342 occupations from BLS), scored each one's AI exposure 0-10 using an LLM, and visualized it as a treemap. if your whole job happens on a screen you're cooked. average score across all jobs is 5.3/10. software devs: 8-9. roofers: 0-1. medical transcriptionists: 10/10 💀 karpathy.ai/jobs

Fanuc has sold over 25 million servo motors worldwide















