can

4.9K posts

@marmaduke091

software dev, AI insider

Joined March 2017
1.4K Following · 7.3K Followers
can
can@marmaduke091·
🚨 100M TOKEN CONTEXT WITHOUT COLLAPSE
> <9% degradation from 16K → 100M
> beats RAG + rerank + SOTA pipelines
> runs on just 2×A800 GPUs
we could be back
艾略特@elliotchen100

The paper is here. It's called MSA: Memory Sparse Attention.

In one sentence: it gives large models native ultra-long memory. Not bolt-on retrieval, not brute-force window expansion, but "memory" grown directly into the attention mechanism and trained end-to-end.

Why don't the previous approaches work?

RAG is essentially an "open-book exam". The model doesn't remember anything itself; it looks everything up on the spot. Whether it finds the right page depends on retrieval quality, and how fast depends on data volume. Once information is scattered across dozens of documents and requires cross-document reasoning, it falls apart.

Linear attention and KV caching are essentially "compressed memory". They do remember, but the harder they compress, the blurrier it gets, and long content is lost.

MSA takes a completely different approach:
→ No compression, no bolt-ons: the model learns to "look at what matters". The core is a scalable sparse attention architecture with linear complexity, so 10× more memory doesn't mean exponentially exploding compute cost.
→ The model knows where each memory came from and when. A positional encoding called document-wise RoPE lets it natively understand document boundaries and temporal order.
→ Fragmented information can still be chained into reasoning. A Memory Interleaving mechanism enables multi-hop reasoning across memory fragments scattered everywhere: not just retrieving one relevant record, but linking the clues into a chain.

The results?
· Scaling from 16K to 100M tokens, accuracy degrades by less than 9%
· A 4B-parameter MSA model beats top 235B-class RAG systems on long-context benchmarks
· Two A800s are enough to run 100M-token inference. This isn't lab-only hardware; it's a cost a startup can afford.

Put simply, large models so far have been extremely smart geniuses with goldfish memory. What MSA wants is to make them truly "remember".

We've put it on GitHub. The algorithm folks worked hard on this; a star would mean a lot. 🌟👀🙏 github.com/EverMind-AI/MSA
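The "document-wise RoPE" idea in the thread can be illustrated with a toy sketch. Everything below is a hypothetical reading of "the model understands document boundaries": the function names and the reset-at-boundary position scheme are my assumptions, and the actual MSA implementation on GitHub may encode boundaries quite differently.

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0):
    """Standard RoPE: one rotation angle per (position, frequency) pair."""
    freqs = base ** (-np.arange(0, dim, 2) / dim)  # (dim/2,)
    return np.outer(positions, freqs)              # (seq, dim/2)

def apply_rope(x, positions):
    """Rotate each pair of channels of x by a position-dependent angle."""
    ang = rope_angles(positions, x.shape[1])
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

def document_wise_positions(doc_ids):
    """Hypothetical document-wise scheme: the position index resets at
    each document boundary, so the model sees within-document offsets
    instead of one global index running across 100M tokens."""
    doc_ids = np.asarray(doc_ids)
    pos = np.zeros(len(doc_ids), dtype=np.int64)
    for i in range(1, len(doc_ids)):
        pos[i] = pos[i - 1] + 1 if doc_ids[i] == doc_ids[i - 1] else 0
    return pos

# Tokens drawn from three concatenated documents:
doc_ids = [0, 0, 0, 1, 1, 2, 2, 2, 2]
print(document_wise_positions(doc_ids))  # [0 1 2 0 1 0 1 2 3]
```

Because RoPE rotations are norm-preserving, resetting positions changes only the relative phases across a document boundary, not the content of the vectors.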

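The "learns to look at what matters" framing in the MSA thread above is the generic idea behind sparse attention: per-query softmax and value-mixing cost depends on how many keys are actually attended, not on total memory length. A minimal top-k sketch follows; this is not MSA's actual selection rule (the thread doesn't specify one), and a real linear-complexity system would use an index structure rather than scoring every key as this toy does.

```python
import numpy as np

def topk_sparse_attention(q, K, V, k=2):
    """One query attends only to its k highest-scoring keys, so the
    softmax and value mix touch k rows instead of the whole memory.
    NOTE: this toy still scores every key (O(memory)); real systems
    avoid that with an approximate-nearest-neighbor style index."""
    scores = K @ q / np.sqrt(len(q))             # similarity to every key
    idx = np.argpartition(scores, -k)[-k:]       # indices of the top-k keys
    w = np.exp(scores[idx] - scores[idx].max())  # stable softmax over k
    w /= w.sum()
    return w @ V[idx]                            # weighted mix of k values

# Toy memory: key 0 matches q overwhelmingly, so the output is ~V[0]
K = np.array([[100., 0.], [0., 1.], [1., 0.], [0., -1.]])
V = np.array([[1., 2.], [3., 4.], [5., 6.], [7., 8.]])
out = topk_sparse_attention(np.array([1., 0.]), K, V, k=2)  # ≈ V[0] = [1., 2.]
```

The point of the sketch is the cost split: selection can be made cheap with indexing, after which attention itself is constant-cost per query regardless of how many tokens sit in memory.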
Chase Brower
Chase Brower@ChaseBrowe32432·
Opus 4.6 in the web UI can solve even the "extremely hard" problems, btw. Not sure what their precise methodology was, but they must have heavily hamstrung the models.
Lossfunk@lossfunk

🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵

can
can@marmaduke091·
MiniMax M2.7 is out on web! Check it out here: agent.minimax.io
can
can@marmaduke091·
@Risichad It doesn't make any sense, my child self would be so excited to see this
Risichad 🦾
Risichad 🦾@Risichad·
@marmaduke091 How can fans of video games and photorealism be against this magical technology ?
can
can@marmaduke091·
🚨 DLSS 5 will look much better than this! The future of neural rendering will let devs, modders, and players configure anything in the game. This time, using an AI video model to reimagine GTA 4 in Russia gives you a feel for how DLSS 5 might look.
can
can@marmaduke091·
@SaintDeveloper With DLSS 5 it will look pretty insane
can
can@marmaduke091·
@GravyDunkers You need someone with taste to prompt it still
Rob Chase
Rob Chase@GravyDunkers·
@marmaduke091 When everything in a game from coding to voice acting to visuals can be configured or reconfigured with AI prompts, why would a profit-driven studio hire for those roles? Who would go to school for those things? What kind of games will we get when they’re 80-90% AI made?
RAW
RAW@RawWilson1·
@marmaduke091 I just realized what DLSS 5 is actually claiming to do: lighting and texture enhancement. Without touching geometry, you can still do a lot for characters/games just by changing lighting and textures.
can
can@marmaduke091·
@techmin651 yeah they are helpless
Bob Derek
Bob Derek@techmin651·
@marmaduke091 people are too limited to imagine it, just like they were when DLSS 1 progressed to DLSS 2
can
can@marmaduke091·
How DLSS 5 might look in GTA 6. Created by NBP
can
can@marmaduke091·
@PragmaticAI_ DLSS 5 in GTA 6 will feel like real life