Dolpheyn

8.5K posts


@dolpheyn

Systems enjoyer

Joined December 2022
1.4K Following · 583 Followers
Pinned Tweet
Dolpheyn @dolpheyn
Yo this is where I'll write tech stuff (primarily software engineering). I don't have anything to write yet so Imma plug my blog lol Why I love Rust (Oct 2021) dolpheyn.hashnode.dev/3-why-i-love-r…
Dolpheyn @dolpheyn
Chief Information, Product and Analytics Non-officer (CIPAN)
Dolpheyn retweeted
schlauchschal @schlauchschal
@meowkoteeq "I'm sorry doctor but I can't take this vaccine, the protein structures needed for its development were calculated by AlphaFold and that's genAI :/"
Dolpheyn @dolpheyn
@LuqmanRom Whoa hahaha what a coincidence. I asked Gemini 🤣
[image attached]
Luqman Rom @LuqmanRom
@dolpheyn I think they're running targeted ads at programmers. I saw it too. Hahaha
Dolpheyn @dolpheyn
Delve helped them find a compliance hack with this one
Kimi.ai @Kimi_Moonshot

Congrats to the @cursor_ai team on the launch of Composer 2! We are proud to see Kimi-k2.5 provide the foundation. Seeing our model integrated effectively through Cursor's continued pretraining & high-compute RL training is the open model ecosystem we love to support. Note: Cursor accesses Kimi-k2.5 via @FireworksAI_HQ's hosted RL and inference platform as part of an authorized commercial partnership.

Dolpheyn @dolpheyn
@kentcdodds Did you make Claude launch a Cursor cloud agent through Cloudflare with codemode, or what are we looking at here?
Dolpheyn @dolpheyn
Keeping YouTube defaulted to 0.9x speed because it allows me to finish watching vids again
Dolpheyn retweeted
radbro @radbro
Stepped into a faraday cage and my internal monologue disappeared
Dolpheyn retweeted
can @marmaduke091
🚨 100M TOKEN CONTEXT WITHOUT COLLAPSE
> <9% degradation from 16K → 100M
> beats RAG + rerank + SOTA pipelines
> runs on just 2×A800 GPUs
we could be back
[image attached]
艾略特 @elliotchen100

The paper is out. It's called MSA, Memory Sparse Attention. In one sentence: it gives large models native ultra-long memory. Not bolted-on retrieval, not brute-force window scaling, but "memory" built directly into the attention mechanism and trained end to end.

Why don't past approaches work? RAG is essentially an "open-book exam". The model remembers nothing itself; it relies on flipping through notes on the spot. Whether it flips to the right page depends on retrieval quality; how fast it flips depends on data volume. Once information is scattered across dozens of documents and requires cross-document reasoning, it falls apart. Linear attention and KV caching are essentially "compressed memory": it remembers, but the more you compress the blurrier it gets, and long content gets dropped.

MSA takes a completely different approach:
→ No compression, no external retrieval; instead, the model learns to "pick out what matters". The core is a scalable sparse attention architecture with linear complexity: 10x the memory doesn't mean an explosion in compute cost.
→ The model knows "where this memory came from and when". A positional encoding called document-wise RoPE lets the model natively understand document boundaries and temporal order.
→ Fragmented information can still be chained into reasoning. A Memory Interleaving mechanism lets the model do multi-hop reasoning across memory fragments scattered everywhere, not just finding one relevant record but stringing the clues into a chain.

The results?
· Scaling from 16K to 100M tokens, accuracy degrades by less than 9%
· A 4B-parameter MSA model beats top-tier 235B-class RAG systems on long-context benchmarks
· 100M-token inference runs on just 2 A800s. This isn't lab-exclusive; it's a cost a startup can afford.

Put simply, large models used to be extremely smart geniuses with goldfish memory. What MSA wants to do is let them truly "remember".

We've put it on GitHub. The algorithm folks don't have it easy, so give it a star to show support. 🌟👀🙏 github.com/EverMind-AI/MSA

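The quoted thread describes sparse attention that "picks out what matters" from a long memory instead of attending to everything. Here is a minimal toy sketch of that general idea (top-k block selection over mean-pooled block summaries); it is not the actual MSA algorithm, and every name, shape, and pooling choice here is an illustrative assumption:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def block_sparse_attention(q, keys, values, block_size=4, top_k=2):
    """Attend only within the top-k memory blocks most relevant to q.

    q: (d,); keys/values: (n, d). Cost scales with top_k * block_size
    selected tokens plus one score per block, not with all n tokens.
    """
    n, d = keys.shape
    n_blocks = n // block_size
    # Coarse summary per block: the mean of its keys (a toy stand-in
    # for whatever learned summary a real system would use).
    summaries = keys[: n_blocks * block_size].reshape(n_blocks, block_size, d).mean(axis=1)
    # Score blocks against the query and keep the top-k.
    block_scores = summaries @ q
    chosen = np.argsort(block_scores)[-top_k:]
    # Gather token indices only from the chosen blocks.
    idx = np.concatenate(
        [np.arange(b * block_size, (b + 1) * block_size) for b in chosen]
    )
    # Ordinary scaled dot-product attention over the selected tokens.
    attn = softmax(keys[idx] @ q / np.sqrt(d))
    return attn @ values[idx]

rng = np.random.default_rng(0)
n, d = 32, 8
keys = rng.normal(size=(n, d))
values = rng.normal(size=(n, d))
q = keys[5]  # a query resembling token 5
out = block_sparse_attention(q, keys, values)
print(out.shape)  # (8,)
```

With block_size=4 and top_k=2, the query touches 8 blocks once each and then only 8 tokens, rather than all 32; that per-query saving is the mechanism behind the linear-complexity claim in the thread, though document-wise RoPE and Memory Interleaving are not modeled here.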
Dolpheyn retweeted
Warfare Analysis @warfareanalysis
For the first time since 1967. No Eid prayer at Al-Aqsa mosque.