Okey Uzukwu
2.5K posts



New info seemingly from DeepSeek staff
- A new, much larger base model will be released soon
- DeepSeek V3.2 seems to be a larger model than the one currently deployed on the web
I used Gemini for translation





Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation.
Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.
🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.
🔗 Full report: github.com/MoonshotAI/Att…
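To make the contrast in the post concrete, here is a rough pure-Python sketch of uniform residual accumulation next to softmax-weighted aggregation over earlier layer outputs. This is not the formulation from the report: the function names, the single `query` vector, and the scaled dot-product scoring are all assumptions for illustration, and Block AttnRes is not modelled.

```python
import math


def standard_residual(layer_outputs):
    """Fixed, uniform accumulation: every preceding output gets weight 1."""
    dim = len(layer_outputs[0])
    return [sum(out[i] for out in layer_outputs) for i in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_residual(layer_outputs, query):
    """Hypothetical depth-wise attention: weight earlier outputs by relevance.

    `query` stands in for a vector derived from the current layer's state
    (an assumption); each earlier output serves as both key and value.
    """
    dim = len(query)
    scale = math.sqrt(dim)
    # Scaled dot-product score between the query and each earlier output.
    scores = [
        sum(q * k for q, k in zip(query, out)) / scale
        for out in layer_outputs
    ]
    weights = softmax(scores)
    # Weighted sum over depth instead of a plain sum.
    mixed = [
        sum(w * out[i] for w, out in zip(weights, layer_outputs))
        for i in range(dim)
    ]
    return mixed, weights


if __name__ == "__main__":
    outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(standard_residual(outputs))
    print(attention_residual(outputs, [10.0, 0.0]))
```

In this sketch the weights sum to 1, so the mixed state stays on the scale of a single layer output instead of growing with depth as the plain sum does. That is one way to read the "hidden-state growth" point in the post.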


@bindureddy Google will win the AI race in the West, China on Earth and SpaceX in space


Trump briefed that Iran's new supreme leader Mojtaba Khamenei is probably gay - and president has priceless reaction trib.al/MugVpH6











After getting rich, what next?


25 years apart… how easily Americans forget.

Google uses Go. Meta uses Go. Microsoft uses Go. Amazon uses Go. Uber uses Go. Dropbox uses Go. Cloudflare uses Go. Twitch uses Go. Docker uses Go. Kubernetes uses Go. PayPal uses Go. Shopify uses Go. What’s stopping you from learning Go?







