can

4.9K posts

@marmaduke091

software dev, AI insider

Joined March 2017
1.4K Following · 7.3K Followers
can
can@marmaduke091·
🚨 100M TOKEN CONTEXT WITHOUT COLLAPSE
> <9% degradation from 16K → 100M
> beats RAG + rerank + SOTA pipelines
> runs on just 2×A800 GPUs
we could be back
艾略特@elliotchen100

The paper is here. It's called MSA: Memory Sparse Attention.

In one sentence: it gives large models native ultra-long memory. Not bolt-on retrieval, not brute-force window expansion, but "memory" grown directly into the attention mechanism and trained end-to-end.

Why don't the previous approaches work?

RAG is essentially an "open-book exam". The model doesn't remember anything itself; it looks everything up on the spot. Whether it finds the right page depends on retrieval quality, and how fast depends on data volume. Once information is scattered across dozens of documents and requires cross-document reasoning, it falls apart.

Linear attention and KV caching are essentially "compressed memory". They do remember, but the harder they compress, the blurrier it gets, and long content is lost.

MSA takes a completely different approach:
→ No compression, no bolt-ons: the model learns to "look at what matters". The core is a scalable sparse attention architecture with linear complexity, so 10× more memory doesn't mean exponentially exploding compute cost.
→ The model knows where each memory came from and when. A positional encoding called document-wise RoPE lets it natively understand document boundaries and temporal order.
→ Fragmented information can still be chained into reasoning. A Memory Interleaving mechanism enables multi-hop reasoning across memory fragments scattered everywhere: not just retrieving one relevant record, but linking the clues into a chain.

The results?
· Scaling from 16K to 100M tokens, accuracy degrades by less than 9%
· A 4B-parameter MSA model beats top 235B-class RAG systems on long-context benchmarks
· Two A800s are enough to run 100M-token inference. This isn't lab-only hardware; it's a cost a startup can afford.

Put simply, large models so far have been extremely smart geniuses with goldfish memory. What MSA wants is to make them truly "remember".

We've put it on GitHub. The algorithm folks worked hard on this; a star would mean a lot. 🌟👀🙏 github.com/EverMind-AI/MSA
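The "document-wise RoPE" idea in the thread can be illustrated with a toy sketch. Everything below is a hypothetical reading of "the model understands document boundaries": the function names and the reset-at-boundary position scheme are my assumptions, and the actual MSA implementation on GitHub may encode boundaries quite differently.

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0):
    """Standard RoPE: one rotation angle per (position, frequency) pair."""
    freqs = base ** (-np.arange(0, dim, 2) / dim)  # (dim/2,)
    return np.outer(positions, freqs)              # (seq, dim/2)

def apply_rope(x, positions):
    """Rotate each pair of channels of x by a position-dependent angle."""
    ang = rope_angles(positions, x.shape[1])
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

def document_wise_positions(doc_ids):
    """Hypothetical document-wise scheme: the position index resets at
    each document boundary, so the model sees within-document offsets
    instead of one global index running across 100M tokens."""
    doc_ids = np.asarray(doc_ids)
    pos = np.zeros(len(doc_ids), dtype=np.int64)
    for i in range(1, len(doc_ids)):
        pos[i] = pos[i - 1] + 1 if doc_ids[i] == doc_ids[i - 1] else 0
    return pos

# Tokens drawn from three concatenated documents:
doc_ids = [0, 0, 0, 1, 1, 2, 2, 2, 2]
print(document_wise_positions(doc_ids))  # [0 1 2 0 1 0 1 2 3]
```

Because RoPE rotations are norm-preserving, resetting positions changes only the relative phases across a document boundary, not the content of the vectors.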

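The "learns to look at what matters" framing in the MSA thread above is the generic idea behind sparse attention: per-query softmax and value-mixing cost depends on how many keys are actually attended, not on total memory length. A minimal top-k sketch follows; this is not MSA's actual selection rule (the thread doesn't specify one), and a real linear-complexity system would use an index structure rather than scoring every key as this toy does.

```python
import numpy as np

def topk_sparse_attention(q, K, V, k=2):
    """One query attends only to its k highest-scoring keys, so the
    softmax and value mix touch k rows instead of the whole memory.
    NOTE: this toy still scores every key (O(memory)); real systems
    avoid that with an approximate-nearest-neighbor style index."""
    scores = K @ q / np.sqrt(len(q))             # similarity to every key
    idx = np.argpartition(scores, -k)[-k:]       # indices of the top-k keys
    w = np.exp(scores[idx] - scores[idx].max())  # stable softmax over k
    w /= w.sum()
    return w @ V[idx]                            # weighted mix of k values

# Toy memory: key 0 matches q overwhelmingly, so the output is ~V[0]
K = np.array([[100., 0.], [0., 1.], [1., 0.], [0., -1.]])
V = np.array([[1., 2.], [3., 4.], [5., 6.], [7., 8.]])
out = topk_sparse_attention(np.array([1., 0.]), K, V, k=2)  # ≈ V[0] = [1., 2.]
```

The point of the sketch is the cost split: selection can be made cheap with indexing, after which attention itself is constant-cost per query regardless of how many tokens sit in memory.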
Chase Brower
Chase Brower@ChaseBrowe32432·
Opus 4.6 in the web UI can solve even the "extremely hard" problems, btw. Not sure what their precise methodology was, but they must have heavily hamstrung the models.
Lossfunk@lossfunk

🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵

can
can@marmaduke091·
MiniMax M2.7 is out on web! Check it out here: agent.minimax.io
can
can@marmaduke091·
@Risichad It doesn't make any sense, my child self would be so excited to see this
Risichad 🦾
Risichad 🦾@Risichad·
@marmaduke091 How can fans of video games and photorealism be against this magical technology ?
can
can@marmaduke091·
🚨 DLSS 5 will look much better than this! The future of neural rendering will let devs, modders, and players configure anything in the game. This time, using an AI video model to reimagine GTA 4 in Russia gives you a feel for how DLSS 5 might look.
can
can@marmaduke091·
@SaintDeveloper With DLSS 5 it will look pretty insane
can
can@marmaduke091·
@GravyDunkers You need someone with taste to prompt it still
Rob Chase
Rob Chase@GravyDunkers·
@marmaduke091 When everything in a game from coding to voice acting to visuals can be configured or reconfigured with AI prompts, why would a profit-driven studio hire for those roles? Who would go to school for those things? What kind of games will we get when they’re 80-90% AI made?
RAW
RAW@RawWilson1·
@marmaduke091 I just realized what DLSS 5 is actually claiming to do: lighting and texture enhancement. Without touching geometry, you can still do a lot for characters/games just by changing lighting and textures.
can
can@marmaduke091·
@techmin651 yeah they are helpless
Bob Derek
Bob Derek@techmin651·
@marmaduke091 people are too limited to imagine it, just like they were when DLSS 1 progressed to DLSS 2
can
can@marmaduke091·
How DLSS 5 might look in GTA 6. Created by NBP
can
can@marmaduke091·
@PragmaticAI_ DLSS 5 in GTA 6 will feel like real life