Romin

428 posts

@rmirmo

Building agentic harnesses with a focus on backpressure + self-testing loops

Joined April 2020
588 Following · 45 Followers
Romin@rmirmo·
@trq212 Let’s gooo this is awesome! Great for extending Skills
English
0
0
0
9
Romin@rmirmo·
Every day I’m surprised to see how many people are still building apps by copying and pasting code from chat interfaces…
Aaron Levie@levie

We are so unbelievably early with agents right now. The majority of companies aren’t even using coding agents at scale, let alone for the rest of knowledge work. We’re still mostly in the chatbot era of work for most of AI right now.

Diffusion of tech takes time, even in the most breakneck of markets, because there are major workflows that need to be reinvented, any regulated or large business has huge governance processes for deploying new tech or agents, data needs to get into well-organized environments, and there’s technical literacy that needs to be established. All things that get solved, but it takes time nonetheless.

A point of comparison for technology diffusion: in 2010, a time by which every person in Silicon Valley knew that cloud was the future, AWS revenue was $500 million, Azure had only launched that year, and GCP was called Google App Engine. By 2025, these 3 platforms generated around $225 billion in revenue. And that’s only about 60% of the cloud market. So from the moment the tech industry saw the future of cloud to today, the market is nearly 1,000 times bigger. And it’s still growing at an insane rate.

The same will happen for agents. Coding agents are like the early days of cloud computing, when developers got on board for initial use cases. Then came the bigger workloads. This gives you a sense for how early we actually are in this transformation.

English
0
0
0
5
Romin reposted
Steve Schoger@steveschoger·
I put together a one hour video on how I've been using Claude Code as my primary design tool. Packed with tons of 🔥 design tips.
English
113
408
5.3K
684.8K
Romin reposted
Thariq@trq212·
I put a lot of heart into my technical writing, I hope it's useful to you all. 📌 Here's a pinned thread of everything I've written. (much of this will be posted on the Claude blog soon as well)
English
206
626
6.4K
652.2K
Romin@rmirmo·
This is so cool! 100M token context window at great retrieval accuracy would be game-changing.
艾略特@elliotchen100

The paper is out. It’s called MSA, Memory Sparse Attention. In one sentence: it gives large models native ultra-long memory. Not bolted-on retrieval, not brute-force context extension, but “memory” grown directly into the attention mechanism and trained end to end.

Why don’t past approaches work? RAG is essentially an open-book exam: the model remembers nothing itself and flips through notes on the spot. Accuracy depends on retrieval quality, speed depends on data volume, and once information is scattered across dozens of documents and needs cross-document reasoning, it falls apart. Linear attention and KV caches are essentially compressed memory: they remember, but the harder they compress, the blurrier it gets, and long contexts lose information.

MSA takes a completely different approach:
→ No compression, no bolt-ons; the model learns to look at what matters. The core is a scalable sparse-attention architecture with linear complexity, so memory can grow 10x without compute exploding.
→ The model knows where each memory came from and when. A positional encoding called document-wise RoPE lets the model natively understand document boundaries and temporal order.
→ Fragmented information can be chained together. A Memory Interleaving mechanism enables multi-hop reasoning across memory fragments scattered everywhere — not just finding one relevant record, but linking clues into a chain.

The results?
· Scaling from 16K to 100M tokens, accuracy degrades by less than 9%.
· A 4B-parameter MSA model beats top 235B-class RAG systems on long-context benchmarks.
· Two A800s can run 100M-token inference. This isn’t lab-only; it’s a cost a startup can afford.

Put simply, large models used to be extremely smart geniuses with goldfish memory. MSA aims to make them truly remember. We’ve put it on GitHub — the algorithm folks worked hard on this, so a star would mean a lot. 🌟👀🙏 github.com/EverMind-AI/MSA

English
0
0
0
14
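The document-wise RoPE idea in the tweet above — position indices that restart at each document boundary, so attention sees per-document relative positions — can be sketched in a few lines. This is a generic illustration of the concept, not the MSA paper’s implementation; `docwise_positions` and `rope` are names made up for this sketch.

```python
import numpy as np

def docwise_positions(doc_ids):
    """Position indices that restart at each document boundary."""
    pos = np.zeros(len(doc_ids), dtype=np.int64)
    for i in range(1, len(doc_ids)):
        pos[i] = 0 if doc_ids[i] != doc_ids[i - 1] else pos[i - 1] + 1
    return pos

def rope(x, pos, base=10000.0):
    """Standard RoPE rotation applied at the given positions.

    x: (seq, dim) with even dim; each pair (x[:, 2i], x[:, 2i+1])
    is rotated by angle pos * base**(-2i / dim).
    """
    seq, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) * 2.0 / dim)  # (half,)
    ang = pos[:, None] * freqs[None, :]             # (seq, half)
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

With `doc_ids = [0, 0, 0, 1, 1]` the positions come out as `[0, 1, 2, 0, 1]`, so the second document’s tokens start at position 0 regardless of where they sit in the concatenated sequence.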
Romin reposted
艾略特@elliotchen100·
(same MSA announcement tweet as quoted above)
艾略特 tweet media
艾略特@elliotchen100

A little teaser: @EverMind will also release another high-quality paper this week

Chinese
159
549
3.1K
1.5M
Romin@rmirmo·
@Yampeleg I think it’s only Opus that’s down.
English
0
0
0
75
Yam Peleg@Yampeleg·
claude code is down how’s everyone doing
English
50
5
105
11K
Romin@rmirmo·
The policy toolkit is much thinner for a physical commodity crisis than a financial one.
English
0
0
0
28
Romin reposted
Dan Loewenherz@dwlz·
It appears I called the bottom. Software engineer job postings are up massively YoY, while overall job postings went down.

I said it in May 2025 and I'll say it again: I've never been more bullish on software engineering as a profession. But at that time (and still today!), many were saying that this profession was going away. It was a bad take then and it's even worse now.

AI is a tool that amplifies productivity: a lever doesn't move on its own, no matter how good that lever is, whether it's just really good tab completion or a cloud agent. Even agents need to be told what to do.

This industry in aggregate is going to be able to satisfy needs and use cases that have never even been considered due to time or resource constraints. The future is bright for builders. Probably has never been brighter.
Dan Loewenherz tweet media
Dan Loewenherz@dwlz

I'm squarely in the camp of believing that there is an insatiable demand for software, and until that stops, software engineers are going to be in overwhelmingly short supply in the long term.

AI is forcing companies to recalibrate their resources (we're all seeing that now with layoffs and freezes), but once AI is more-or-less saturated among most companies, hiring more engineers will, once again, be the only way to build more product than the competition. And yes, being better than the competition is all that matters. Not being better than the competition 2 years ago. The pre-AI world is gone and not coming back. Readjust your expectations.

Companies that think they'll be able to last with tiny headcounts are going to get crushed by the companies who choose to grow and force AI adoption across large teams. Yes, there's a brief period where adoption hasn't fully hit yet, but once it has, the low water mark for productivity will adjust and you still need more people to build more product, unless you have some secret AI sauce that no one else has access to (indie solopreneur building saas products, that's not you!).

English
63
96
1.2K
124.2K
Romin@rmirmo·
@AdamHoltererer Unfortunately all the websites being built with the frontend design skill tend to look identical lol. Better to provide @variantui samples as design references.
English
1
0
6
2K
Adam Holter@AdamHoltererer·
Gemini 3.1 Pro Frontend Test: Without Skill vs. With Skill
Adam Holter tweet media (×2)
English
37
14
1.1K
136.7K
Philipp Schmid@_philschmid·
Gemini 3.1 Pro update! An upgrade to our best coding and agentic Gemini model! 🚀 Here is all you need to know:
- Same 1M context with 64k output, knowledge cutoff of Jan 2025.
- Same $2 / $12 (<200k tokens); $4 / $18 (>200k tokens).
- 2.5x better abstract reasoning (77.1% on ARC-AGI-2 vs 31.1% for 3 Pro).
- 82% better agentic tool use (33.5% on APEX-Agents vs 18.4% for 3 Pro).
- #1 on MCP Atlas (69.2%) and BrowseComp (85.9%).
- SWE-Bench Verified (80.6%), Terminal-Bench 2.0 (68.5%).
Gemini 3.1 Pro is available in @GoogleAIStudio and the Gemini API, @GeminiApp, @googlecloud and Google @Antigravity.
Philipp Schmid tweet media
English
29
39
612
54.3K
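At the rates quoted above, per-request cost is simple arithmetic. A minimal sketch, assuming the higher tier applies to the whole request once the prompt exceeds 200k tokens (the tweet doesn’t spell out the boundary semantics, so that part is an assumption):

```python
def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the tweet's quoted Gemini 3.1 Pro rates.

    Rates are per 1M tokens. Assumption: the >200k tier applies to the
    whole request whenever the prompt exceeds 200k tokens.
    """
    if input_tokens <= 200_000:
        in_rate, out_rate = 2.00, 12.00   # $/1M tokens, <=200k tier
    else:
        in_rate, out_rate = 4.00, 18.00   # $/1M tokens, >200k tier
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
```

For example, a 100k-token prompt with 10k output tokens lands in the lower tier: 0.1 × $2 + 0.01 × $12 = $0.32.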
Romin@rmirmo·
Just built a tool that lets me convert any HTML/CSS (such as AI generated landing pages) into Webflow JSON. So I can just copy an entire AI-generated high-fidelity prototype and paste it into Webflow. Variables linking and everything. Huge time unlock on client builds.
English
0
0
3
113
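A converter like the one described above splits into two halves: parsing the HTML into a node tree, then mapping that tree onto Webflow’s clipboard schema. Only the first half can be sketched confidently with the standard library — Webflow’s paste format is undocumented, so it’s left out here, and `NodeTreeBuilder`/`html_to_json` are names invented for this sketch rather than anything from the tweet’s tool.

```python
import json
from html.parser import HTMLParser

class NodeTreeBuilder(HTMLParser):
    """Parse an HTML fragment into a JSON-serializable node tree.

    This is the generic half of an HTML -> Webflow pipeline; mapping the
    tree (and CSS classes) onto Webflow's own schema would sit on top.
    """
    def __init__(self):
        super().__init__()
        self.root = {"tag": "root", "attrs": {}, "children": []}
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "children": []}
        self.stack[-1]["children"].append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only runs between tags
            self.stack[-1]["children"].append({"text": data.strip()})

def html_to_json(fragment: str) -> str:
    builder = NodeTreeBuilder()
    builder.feed(fragment)
    return json.dumps(builder.root)
```

Class attributes survive in `attrs`, which is what would let a second pass link them to design variables and styles on the Webflow side.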
Romin@rmirmo·
@burkov I'm still on GPT-5.2 High. I find it's better than GPT-5.3-Codex-High.
English
0
0
0
57
BURKOV@burkov·
GPT-5.3-Codex is only borderline competitive with Claude Opus 4.6 when you set the reasoning effort to Extra High. Any value below Extra High just results in unforgivably dumb decisions and argumentation. And this attitude. This fucking attitude of an autistic misanthrope. It’s better not to ask this jerk any questions at all.
English
72
9
251
37.2K
Julian Galluzzo@galluzzo_julian·
I LOVE Claude Code. It's all I use. But... many of the best developers I know have been raving about how good Codex is. Not to mention... it's way more affordable. So - you can now use Codex in @shipstudio_app! Live now 🚀🚀🚀
Julian Galluzzo tweet media
English
3
2
12
575