darren

7.4K posts

@darrenangle

language modeler

☯️ 🇺🇲 views mine · Joined August 2009
2K Following · 2.2K Followers
Pinned Tweet
darren
darren@darrenangle·
"everything is going to be ok"
darren tweet media
1
5
57
15.5K
James Zhou
James Zhou@jameszhou02·
btw their supabase storage bucket is publicly accessible via any signed url token 😭
exposes:
> employee background checks
> equity vesting schedules and grant amounts
> performance reviews
> session tokens for stripe, notion, etc
> screenshots below 🧵
i also got access to their notion 😛
James Zhou tweet media
erin griffith@eringriffith

A detailed and brutal look at the tactics of buzzy AI compliance startup Delve "Delve built a machine designed to make clients complicit without their knowledge, to manufacture plausible deniability while producing exactly the opposite." substack.com/home/post/p-19…

80
59
1.2K
346.2K
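A minimal sketch of the kind of check the tweet above describes: probing whether an object in a Supabase storage bucket is readable through the public storage endpoint with no credentials at all. It checks the simpler no-auth case rather than the signed-URL-token issue mentioned in the tweet; the project ref, bucket, and object path are hypothetical placeholders, and this should only be pointed at resources you own.

```python
# Probe a Supabase storage object through the public endpoint with no
# credentials. If it returns 200, the object is world-readable.
import requests

PROJECT_REF = "your-project-ref"          # hypothetical placeholder
BUCKET = "internal-docs"                  # hypothetical placeholder
OBJECT_PATH = "reviews/2024/example.pdf"  # hypothetical placeholder

url = (
    f"https://{PROJECT_REF}.supabase.co"
    f"/storage/v1/object/public/{BUCKET}/{OBJECT_PATH}"
)
resp = requests.get(url, timeout=10)      # note: no API key, no signed token

if resp.status_code == 200:
    print(f"EXPOSED: {url} is readable without authentication")
else:
    print(f"Not publicly readable (HTTP {resp.status_code})")
```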
darren
darren@darrenangle·
substrate revelation engineer
0
0
1
67
darren retweeted
Jerry Tworek
Jerry Tworek@MillionInt·
AI labs need a wallfacer project. AI researcher not having to explain themselves to anyone. performing seemingly random actions with hidden inscrutable agenda to create a SOTA model in a way no one would deem possible
27
13
385
50K
darren
darren@darrenangle·
holy
艾略特@elliotchen100

The paper is out. It's called MSA: Memory Sparse Attention.
In one sentence: it gives large models native ultra-long memory. Not bolt-on retrieval, not brute-force window expansion; memory is grown directly into the attention mechanism and trained end to end.
Why don't the existing approaches work?
RAG is essentially an open-book exam. The model remembers nothing itself; it flips through notes on the spot. Accuracy depends on retrieval quality, speed depends on data volume, and as soon as information is scattered across dozens of documents and needs cross-document reasoning, it falls apart.
Linear attention and KV caching are essentially compressed memory. It remembers, but the more it compresses the blurrier things get, and long contexts get dropped.
MSA takes a completely different approach:
→ No compression, no bolt-ons: the model learns to look only at what matters. The core is a scalable sparse-attention architecture with linear complexity. Grow the memory 10x and the compute cost doesn't blow up exponentially.
→ The model knows where each piece of memory came from and when. A positional encoding called document-wise RoPE lets it natively understand document boundaries and temporal order.
→ Fragmented information can still be chained into reasoning. A Memory Interleaving mechanism lets the model do multi-hop reasoning across memory fragments scattered all over: not just finding one relevant record, but linking the clues into a chain.
The results?
· Scaling from 16K to 100 million tokens, accuracy degrades by less than 9%
· A 4B-parameter MSA model beats top-tier 235B-class RAG systems on long-context benchmarks
· Two A800s are enough to run 100-million-token inference. That's not lab-only hardware; it's a cost a startup can afford.
Put plainly, large models so far have been extremely smart geniuses with goldfish memory. What MSA wants to do is let them truly remember.
We've put it on GitHub. The algorithm folks worked hard on this; a star would mean a lot. 🌟👀🙏 github.com/EverMind-AI/MSA

0
0
2
454
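The repo linked above has the actual implementation; as a rough illustration of the "learn to look only at what matters" idea, here is a toy top-k block-sparse attention pass in PyTorch. The block size, the mean-key block summary, and the top-k selection are illustrative assumptions, not the MSA architecture, and this sketch omits document-wise RoPE and memory interleaving entirely.

```python
# Toy block-sparse attention: each query attends only to the top_k memory
# blocks whose summary key scores highest, so cost scales with
# top_k * block_size rather than the full memory length M.
import torch
import torch.nn.functional as F

def blockwise_topk_attention(q, k, v, block_size=64, top_k=4):
    """q: (T, d) queries; k, v: (M, d) long memory."""
    M, d = k.shape
    n_blocks = M // block_size
    k_blocks = k[: n_blocks * block_size].reshape(n_blocks, block_size, d)
    v_blocks = v[: n_blocks * block_size].reshape(n_blocks, block_size, d)

    # Cheap per-block summary: the mean key of each block.
    block_keys = k_blocks.mean(dim=1)                   # (n_blocks, d)
    block_scores = q @ block_keys.T                     # (T, n_blocks)
    top_idx = block_scores.topk(top_k, dim=-1).indices  # (T, top_k)

    out = torch.empty_like(q)
    for t in range(q.shape[0]):
        sel_k = k_blocks[top_idx[t]].reshape(-1, d)     # (top_k*block_size, d)
        sel_v = v_blocks[top_idx[t]].reshape(-1, d)
        attn = F.softmax((q[t] @ sel_k.T) / d ** 0.5, dim=-1)
        out[t] = attn @ sel_v
    return out

# Example: 8 queries over a 4096-token memory, 4 blocks of 64 tokens each.
q = torch.randn(8, 128)
k = torch.randn(4096, 128)
v = torch.randn(4096, 128)
print(blockwise_topk_attention(q, k, v).shape)  # torch.Size([8, 128])
```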
darren
darren@darrenangle·
low is the way to the upper bright world
max!@maxsloef

@SenSanders senator, when you use the live voice mode, you are chatting with haiku 4.5, and it is not very smart. i would recommend just chatting with opus 4.6 and playing its responses through elevenlabs

0
0
2
127
darren retweeted
Aidan Gomez
Aidan Gomez@aidangomez·
The coolest thing out there right now is just still having empathy and values. Red pilling, vice signaling, OUT. Caring, believing, IN.
26
40
524
61.2K
_ - \.
_ - \.@cephaloform·
okay i got like a 12 day run on 9b set on so im gonna read books and watch tv and eat food and things yay
1
0
4
82
_ - \.
_ - \.@cephaloform·
model started swearing in the cot
_ - \. tweet media
1
0
11
301
darren
darren@darrenangle·
"You realize that you could be creating all of the businesses and projects and art you ever wanted and all you've got to do is put your instructions in the right order and put the nickels in the bag." - @deepfates
🎭@deepfates

You might think the "agents" thing is just coming for software engineers. Yeah, agents write code, and code sells a bunch of tokens. But most people's work isn't code, it's memos or decks or whatever.

Why this is false: agents can do anything you can do on a computer, and they do it by spending output tokens to write code. The number of keypresses used by a consultant to do a task is not a good measurement of the number of tokens an agent would use.

For example: one "deep research" report might be 20 pages of output tokens. But it also might have required more than 20 pages of output tokens to do all the searches, fetches, PDF parsing and interim summaries that you never even see as the user. It also had to input all the tokens of every document it read in searching, likely more than 20 pages, since the point of the report is to collect and summarize this information. So now we're at 3x tokens for the final output.

That one report is so cheap, and so fast, that now you can do more research than ever. This is valuable! If your business relies on having good information about the world, you can probably find a way to make more money by doing 3 deep research reports and then synthesizing them. More tokens!

Now you've kicked off three deep research reports, you deserve a little treat, right? So you fire up your browser agent and tell it: go find me some nice linen shirts for summer in my size, open them in tabs so I can look through. Well, your browser agent has to interact with the browser using some kind of tool, and you know what that tool is? Code, baby. Tokens.

And the tokens are so cheap. You've got to understand: we're spending a lot in the aggregate, but in the moment it is "spend a nickel for 10 minutes of being literally Superman". Like yes, I'll just keep spending nickels actually. I will never stop being Superman at that price.

All knowledge workers will feel this. A lot of you already do, you're just hiding it from your boss so you can have more free time while "working from home". And maybe it's better to protect yourselves from Jevons as long as possible, because once you get the bug it's hard to stop. You realize that you could be creating all of the businesses and projects and art you ever wanted and all you've got to do is put your instructions in the right order and put the nickels in the bag.

I would happily bet against Anthropic's revenue spike being a brief "sugar high". So would most capital allocators! That is because they have already seen that software can eat the world.

White collar knowledge work fundamentally changes in the face of agent economics and entirely new forms of knowledge production? It's happened already in finance: high frequency trading. Now it's happening in tech: high frequency software. Then we will have high frequency science, high frequency governance, high frequency engineering, high frequency medicine and high frequency law.

Human society is about to be absolutely DDOSed by information at all levels of the stack. Our civilization was never meant to handle this many tokens. If anything can be done on a computer it will be turned into tokens instead of human actions, and it will happen faster and in parallel.

This stuff works, it is real, it is getting better. It is going to hit economically and socially this year and nobody is ready, and I think it is important to start taking it seriously, instead of finding ever more arbitrary reasons to remain in denial.

0
1
12
1.6K
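A back-of-envelope version of the "3x tokens" and "nickels" arithmetic in the thread above. Every number here (tokens per page, the per-million-token prices) is an illustrative assumption, not a figure from the thread or any provider's price sheet.

```python
# Rough cost of one "deep research" report under assumed token counts and prices.
TOKENS_PER_PAGE = 500   # assumption
REPORT_PAGES = 20       # from the thread's example

visible_output = REPORT_PAGES * TOKENS_PER_PAGE  # the report you actually read
hidden_output = REPORT_PAGES * TOKENS_PER_PAGE   # searches, parsing, interim summaries
input_tokens = REPORT_PAGES * TOKENS_PER_PAGE    # documents the agent ingested

# Assumed prices per million tokens (input / output), purely illustrative.
PRICE_IN, PRICE_OUT = 3.00, 15.00
cost = (input_tokens * PRICE_IN + (visible_output + hidden_output) * PRICE_OUT) / 1e6

total = visible_output + hidden_output + input_tokens
print(f"~{total:,} tokens, ~${cost:.2f} per report")  # ~30,000 tokens, ~$0.33
```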
darren
darren@darrenangle·
timeline cleanse
darren tweet media
0
0
3
85
darren retweeted
Soren Larson
Soren Larson@hypersoren·
more signal of the coase conjecture
many rn feel impatient
>they must tokenmaxx else be condemned to the permanent underclass
but even labs are skeptical that this tokenmaxxing is for good use
coase applied suggests this impatience distorts perception of margin durability
Soren Larson tweet media
Soren Larson@hypersoren

x.com/i/article/2028…

1
3
27
4.1K
darren retweeted
Hamel Husain
Hamel Husain@HamelHusain·
The real AI flex is how many LOC deleted, not added
35
15
261
21K
darren
darren@darrenangle·
@jasminewsun @TheAtlantic you can train a model to write good / experimental mfa-level poetry but you kind of have to nuke its ability to do anything else
0
0
1
92
jasmine sun
jasmine sun@jasminewsun·
somehow the same AIs that can do PhD-level math and superhuman coding can only write as well as “a real poet’s okay poem” (sama’s words, not mine!) I talked to the people training AIs to write about what makes it so hard: new from me for @TheAtlantic: theatlantic.com/technology/202…
jasmine sun tweet media
jasmine sun tweet media
21
39
423
64.1K
darren
darren@darrenangle·
‘sharp’ is the new ‘delve’ by the way
0
0
1
96