retto
547 posts

retto
@rettooooo
software engineer, building things. founder @Vidbytee



Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers. 🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth. 🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale. 🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead. 🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains. 🔗Full report: github.com/MoonshotAI/Att…





Thank god MCP is dead Just as useless of an idea as LLMs.txt was It's all dumb abstractions that AI doesn't need because AI's are as smart as humans so they can just use what was already there which is APIs

Introducing the new /crawl endpoint - one API call and an entire site crawled. No scripts. No browser management. Just the content in HTML, Markdown, or JSON.










