Ben Geist
706 posts

@b_geist
Research Eng @ramplabs / physics + math nerd / Kate Bush fan

Uber’s COO has said that it’s getting “harder to justify” its AI costs because there was no way to show a link between AI spend and any meaningful increase in useful features. This is the first time I’ve seen a company say this directly. businessinsider.com/uber-coo-andre…

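It looks like @MiniMax_AI M3 is coming soon. A technical diagram posted earlier by engineering lead @SkylerMiao7 shows that the MiniMax M3 model is confirmed to have a one-million-token context and to use a dynamic block-sparse attention design built on GQA (Grouped Query Attention). An Index Branch first does a coarse retrieval, then a Sparse Branch runs real attention over the selected blocks. The logic: the current query does not need to see the entire history, only the top-k relevant history blocks. By analogy, when reading a book you don't reread every page; you quickly check the table of contents or index, locate a few relevant chapters, and then read those closely. The effect of this design is clear: at one million tokens of context, prefill is 9.7x faster than before and decode is 15.6x faster. Looking forward to seeing whether DeepSeek V4 or MiniMax M3 turns out to be the price-performance king.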
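
To make the index-then-attend logic concrete, here is a minimal single-query sketch in plain Python. It is an illustration under assumptions, not MiniMax's implementation: the post does not say how the Index Branch scores blocks, so scoring each block by the dot product of the query with the block's mean key is a stand-in chosen here, and the function names (`select_blocks`, `block_sparse_attention`) are hypothetical. GQA head grouping is omitted entirely.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def select_blocks(query, keys, block_size, top_k):
    """Coarse retrieval (assumed scoring rule): rank blocks by query . mean(key)."""
    scored = []
    for start in range(0, len(keys), block_size):
        block = keys[start:start + block_size]
        mean_key = [sum(col) / len(block) for col in zip(*block)]
        scored.append((dot(query, mean_key), start))
    best = sorted(scored, reverse=True)[:top_k]
    return sorted(start for _, start in best)


def block_sparse_attention(query, keys, values, block_size, top_k):
    """Softmax attention restricted to the tokens of the top_k selected blocks."""
    starts = select_blocks(query, keys, block_size, top_k)
    idx = [i for s in starts for i in range(s, min(s + block_size, len(keys)))]
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, keys[i]) * scale for i in idx])
    dim = len(values[0])
    out = [sum(w * values[i][d] for w, i in zip(weights, idx)) for d in range(dim)]
    return out, starts


# Toy history: 8 tokens in 4 blocks of 2; only the block starting at 4 matches the query.
query = [1.0, 0.0]
keys = [[0.0, 1.0], [0.0, 1.0], [0.0, -1.0], [0.0, 1.0],
        [5.0, 0.0], [5.0, 0.0], [-1.0, 0.0], [-1.0, 0.0]]
values = [[float(i), 0.0] for i in range(len(keys))]
out, starts = block_sparse_attention(query, keys, values, block_size=2, top_k=1)
```

With `top_k=1` the query attends to 2 of the 8 tokens, so the attention cost scales with `top_k * block_size` instead of the full history length; setting `top_k` to the number of blocks recovers ordinary dense attention.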

AI folks in NYC -- Data Driven NYC (#121) this Tuesday at 6pm. Come meet fellow AI builders and our speakers: * @RampLabs has been cooking lately with a lot of agentic innovation; Alex Levinson will demo * @EstuaryDev provides unified data infra for AI - CEO @dyaffe RSVP: luma.com/ddnyc121