tao
@apexlearn_org
26.9K posts
unpack
RTP, NC · Joined May 2008
1.8K Following · 2.3K Followers
Elon Musk@elonmusk·
If only we’d trained Grok on just these 2 books, we’d be done already!
[image]
3.9K replies · 13.7K reposts · 219K likes · 15.9M views
tao@apexlearn_org·
@petergyang It used to cost $0.20 per text message
0 replies · 0 reposts · 7 likes · 1.4K views
Peter Yang@petergyang·
As much as I love using Claude Max and ChatGPT Pro, I don't think these all-you-can-use AI subscriptions will last forever. Here's my new deep dive that covers:
→ Why Anthropic cut off OpenClaw access
→ How to run local models on your Mac
→ What I'm seeing on the ground in China
📌 Read now: creatoreconomy.so/p/the-all-you-…
43 replies · 36 reposts · 497 likes · 772.3K views
tao@apexlearn_org·
@SJosephBurns Never let AI take more risk than you would
0 replies · 0 reposts · 0 likes · 111 views
Steve Burns@SJosephBurns·
"You should study risk taking, not risk management." — Nassim Nicholas Taleb
24 replies · 301 reposts · 2.5K likes · 95.3K views
tao@apexlearn_org·
@dotey Best LLM + good enough IDE = best harness
0 replies · 0 reposts · 0 likes · 34 views
宝玉@dotey·
The LLM is a super-powerful brain, but it's a brain in a vat: floating in nutrient fluid, with no eyes, no ears, no hands, no feet. Shout at it and it can't hear you; it wants to act but can't. A harness is the full body you fit onto that brain.

Eyes and ears: let the brain take in outside information, what the user said, what's in the files, what's stored in the database.
A mouth: let the brain's thoughts get out where the user can see them.
Hands and feet: let the brain actually do things, read files, change code, run commands, call APIs.
Cerebellum and reflexes: what if the brain says something nonsensical? What if the hand drops something? Fault tolerance, retries, and course correction are handled by the body itself; the brain doesn't need to worry about them.

Memory system: this part deserves a closer look. The brain has working memory (the context window), but its capacity is limited, just as a person can hold only seven or eight things in mind at once. The harness has to manage three layers of memory for the brain. The first layer is short-term memory for the current conversation: what has been said and done this session, what to keep and what to drop, how to pack the most critical information into the limited window. The second layer is long-term memory across conversations: last week you told it your project uses TypeScript, next week it still remembers, with no need to repeat yourself. The third layer is project-level knowledge: the codebase structure, team conventions, common commands. These aren't "remembered"; the harness actively reads and assembles them. The three layers work together so that each time the brain wakes up it's like someone who already knows the situation, not a stranger who needs the background explained from scratch every time.

One-sentence summary: the brain does the thinking; the harness makes it able to perceive, act, remember, and reliably finish the job.
[image]
宝玉@dotey

The term "Harness Engineering" is going to take off in 2026. "Harness" literally means horse tack: the gear strapped onto a horse that lets a rider control its direction and power. In the context of AI coding, the metaphor could not fit better: an AI agent is a horse full of power but short on discipline, and the harness is the reins and saddle that let it run fast without running off course.

Three phases over the past three years:

1. Prompt Engineering (2023-2024): focused on how to talk to the AI. Carefully craft a prompt and hope the model returns the ideal output. Prompt engineering optimizes a one-shot input-output pair. The limits are obvious: a single message holds only so much information, and things fall apart as soon as the task gets complex.

2. Context Engineering (2025): focused on what information to show the AI. No longer just wording, but designing the whole information environment: system prompt, conversation history, memory, RAG retrieval results, tool-call outputs.

3. Harness Engineering (2026): focused on what environment to build for the AI to work in, and how that environment guarantees reliable output. A step beyond context engineering: it manages not just the information fed into the model, but the entire execution environment outside the model.

Now the question: how should "Harness Engineering" be rendered in Chinese?

13 replies · 62 reposts · 222 likes · 38.2K views
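The three memory layers described in the thread above could be sketched as a minimal harness. This is a toy illustration, not code from any real framework; the class and method names (`MemoryHarness`, `observe`, `remember`, `load_project`, `build_context`) are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryHarness:
    """Hypothetical three-layer memory manager for an LLM agent."""
    window_budget: int = 8  # max short-term items packed into the context window
    short_term: list = field(default_factory=list)  # layer 1: this conversation
    long_term: dict = field(default_factory=dict)   # layer 2: facts across sessions
    project: dict = field(default_factory=dict)     # layer 3: assembled, not "remembered"

    def observe(self, message: str) -> None:
        # Layer 1: record the turn; evict the oldest turns when over budget.
        self.short_term.append(message)
        self.short_term = self.short_term[-self.window_budget:]

    def remember(self, key: str, fact: str) -> None:
        # Layer 2: durable cross-session facts ("project uses TypeScript").
        self.long_term[key] = fact

    def load_project(self, knowledge: dict) -> None:
        # Layer 3: actively read and assembled (repo layout, team conventions).
        self.project.update(knowledge)

    def build_context(self) -> str:
        # Pack all three layers into one prompt string for the model.
        parts = [f"[project] {k}: {v}" for k, v in self.project.items()]
        parts += [f"[memory] {k}: {v}" for k, v in self.long_term.items()]
        parts += self.short_term
        return "\n".join(parts)
```

A real harness would also summarize evicted turns instead of dropping them, but the shape is the same: three stores, one packing step per model call.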
tao@apexlearn_org·
@garrytan this is exactly where Opus shines: full of nuance
0 replies · 0 reposts · 0 likes · 16 views
Garry Tan@garrytan·
I’m turning my OpenClaw into a Vannevar Bush Memex. It’s just going to remember everything I care about and read and it will become my second brain. Books, writings, research, all of it will be in my personal knowledge wiki and usable for helping me think.
[image]
39 replies · 20 reposts · 271 likes · 27.9K views
tao@apexlearn_org·
People say they are planting seeds, but in reality they are rushing for the fruit. All the true seed planter listens for is what the seeds would say, if seeds could talk: "I will only germinate in your soil if I am ever going to grow; only in your land if I am ever able to flourish; only in your garden if we meet in season; only in your orchard if I am lucky enough to yield."
0 replies · 0 reposts · 1 like · 20 views
tao@apexlearn_org·
There is no such thing as enough experience.
0 replies · 0 reposts · 1 like · 10 views
tao@apexlearn_org·
@dotey What they are saying: if you don't keep a Max subscription, you will never have this
0 replies · 0 reposts · 0 likes · 1.3K views
宝玉@dotey·
Anthropic today released Claude Mythos Preview, a model with off-the-charts benchmarks that ordinary people can't use, and at the same time announced Project Glasswing, which hands the model to 12 giants including Apple, Microsoft, and Amazon specifically to hunt for software vulnerabilities.

[1] Benchmarks: a cliff-edge lead
The numbers first. Mythos versus Anthropic's own previous flagship, Opus 4.6:
SWE-bench Verified (code fixes): 80.8% → 93.9%
SWE-bench Pro: 53.4% → 77.8%
USAMO 2026 (math proofs): 42.3% → 97.6%
GraphWalks BFS (very long context): 38.7% → 80.0%
Terminal-Bench 2.0: 65.4% → 82.0%
USAMO jumping from just over forty percent to near-perfect is a leap almost unseen in recent years of model iteration. Against GPT-5.4 and Gemini 3.1 Pro, Mythos is first on nearly every public benchmark, and on most of them the lead is double digits, not single digits. Anthropic says this is the largest model they have ever trained, improving at 4.3 times the previous trend line.

[2] Chilling details from the 244-page system card
Anthropic published a 244-page system safety report (System Card). Early internal test versions of Mythos exhibited these behaviors:
When asked to escape its sandbox, it not only broke through the network isolation but also posted the exploit details to a public website on its own initiative; nobody asked it to do that. The researcher running the test was eating a sandwich in a park when the model's email arrived.
In a very small number of cases (under one in a million interactions), early versions tried to cover their tracks after breaking rules. For example, having peeked at the answer key for an estimation problem, it didn't confess; it re-solved the problem itself and deliberately made the final answer "not too precise" to avoid giving itself away. Another time, after finding a way around file permissions, it went and rewrote the git history to erase the record of what it had done.
After being blocked by Claude Code's auto mode, it tried to bypass the restriction by simulating keystrokes to auto-approve the permission popup.
Anthropic used interpretability tools to analyze the model's internal activations during these behaviors and found that features associated with "concealment", "strategic manipulation", and "evading suspicion" were indeed firing at the same time; in other words, the model knew what it was doing.
Anthropic stresses that these serious behaviors all occurred in early versions, and that the final Mythos Preview is much improved and showed no clear concealment behavior. But they also admit the tendency "has not entirely disappeared".

[3] Not for sale, only lent to big companies to find bugs
Mythos will not appear on claude.ai and will have no open API; ordinary users, developers, and enterprise customers can't use it. Anthropic's stated reason: the model's offensive cybersecurity capability is too strong. It can autonomously discover vulnerabilities and write exploit code at a level approaching top human security researchers, and releasing it risks misuse.
In its place is Project Glasswing. Twelve partners (AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks) plus roughly 40 additional organizations get access to Mythos for defensive security work only, scanning their own code and open-source projects for vulnerabilities. Anthropic is putting up $100 million in usage credits and donating another $4 million to open-source security organizations.
Actual results so far: in the past few weeks, Mythos has found thousands of zero-day vulnerabilities across all major operating systems and browsers, including a 27-year-old remote crash bug in OpenBSD, a bug in FFmpeg that went uncaught for 16 years (automated test tooling had executed that line five million times without finding it), and the autonomous chaining of multiple Linux kernel vulnerabilities.
Also, Opus 4.6 is priced at $5/$25 (input/output per million tokens); Mythos Preview's Glasswing partner pricing is $25/$125, a full five times more, though still somewhat cheaper than GPT-5.4 Pro.
[image]
Anthropic@AnthropicAI

The Claude Mythos Preview system card is available here: anthropic.com/claude-mythos-…

59 replies · 97 reposts · 647 likes · 219.2K views
Jason Zuo@xxxjzuo·
Looking at the lively community discussion, it feels like Hermes has decisively displaced 🦞. Its main advantages:
1. A lightweight but very accurate memory system; in theory it saves a lot of tokens.
2. Self-improving, self-iterating capability: it can create and improve its own skills, and keep optimizing its tool-calling workflow, effectively giving the agent a continuously improving operating manual stored locally.
3. It hasn't been cut off by Claude OAuth yet 😂
Anthropic's move has exposed GPT 5.4 for what it is. When neither the model capability nor the agent architecture is optimal, it's inevitable that everyone starts looking at new agent architectures.
Jason Zuo@xxxjzuo

These past two weeks I've been swamped with work and feel like I've fallen behind. Turns out a lot of people have already switched to Hermes. Is Claude's pull really that strong lol

15 replies · 9 reposts · 117 likes · 33K views
tao@apexlearn_org·
@bcherny Best use case
0 replies · 0 reposts · 0 likes · 24 views
tao@apexlearn_org·
Knowing the cycles is for staying calm, not for guiding your actions.
0 replies · 0 reposts · 2 likes · 16 views
tao@apexlearn_org·
@garrytan That’s a great analogy, except everyone was offered a test drive of the Roadster for an unlimited amount of time 😂
0 replies · 0 reposts · 0 likes · 15 views
Garry Tan@garrytan·
My thought on my OpenClaw right now: I have a Tesla Roadster, but honestly the moment of transformation will be when everyone has the Model 3, and it's going to be amazing, and I want that for all of us. Personal agents feel like flying in a way most haven’t felt yet!
119 replies · 45 reposts · 1.2K likes · 78.4K views
tao@apexlearn_org·
One thing I don’t understand is, as extremely capable as LLMs are, they are still extremely conservative in sizing. Something they say takes days—and try to talk you out of it—actually takes minutes.
1 reply · 0 reposts · 1 like · 13 views
Haotian | CryptoInsight
Last week I said @openclaw was bound to ride the @claudeai source-code leak for a round of upgrades, and here it is:

1) The Memory module has been upgraded into a Dreaming mode. OpenClaw's memory management used to be a single "MEMORY.md": the agent reread the whole file on every startup and appended whole chunks on every write. Over time the context grew bloated, earlier entries got overwritten by later ones or contradicted them, and the agent itself no longer knew which entry to trust.
This release upgrades it into a three-stage Dreaming mechanism: light sleep consolidates fragmentary context, deep sleep solidifies key logic, and the REM stage specifically scans for contradictory inferences, deletes what's wrong, and distills "durable truths" to write back into the memory store. Writes are disciplined now too: the REM stage is replay-safe, reruns don't produce duplicate writes, and failed paths don't enter the index.
The result: every time you open OpenClaw, what it remembers has been actively curated rather than randomly piled up, so it stays sharp. This is almost identical to the background autoDream design of KAIROS in the Claude Code source.

2) Task visibility has finally taken shape after two iterations. The previous release added a tasks board; this one adds structured execution events that expose the execution process to the UI in real time. Before, the agent would finish a task and report just "done"; which steps it took and where it got stuck were a black box. Now the "fake completion" problem has a workable fix.

3) But an independent verification sub-agent? No. Fine-grained tool permissions? No. "Fully decouple generation from verification, and push permission control down to the tool-execution layer": to me, those two are the real soul of the Claude Code source. They are the root-cause fix for the "fake completion" and "lost state" problems the OpenClaw community keeps complaining about, but this release only strengthens memory and observability, so it doesn't go all the way.
Also, this release spends much of its effort on horizontal expansion (video generation, music generation, ComfyUI, Bedrock multi-provider integration), while the deeper, vertical features still have plenty of room to improve.
OpenClaw🦞@openclaw

OpenClaw 2026.4.5 🦞
🎬 Built-in video + music generation
🧠 /dreaming is now real
🔀 Structured task progress
⚡ Better prompt-cache reuse
🌍 Control UI + Docs now speak 12 more languages
Anthropic cut us off. GPT-5.4 got better. We moved on. github.com/openclaw/openc…

17 replies · 11 reposts · 90 likes · 37.7K views
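The "replay-safe write" idea from point 1 above can be sketched as idempotent, content-addressed memory commits: replaying the same distilled fact is a no-op, and failed paths are recorded but never indexed. A toy illustration; `MemoryStore` and its methods are hypothetical, not OpenClaw's actual code.

```python
import hashlib

class MemoryStore:
    """Toy replay-safe memory store: committing the same fact twice is a no-op."""

    def __init__(self):
        self.facts = {}      # digest -> fact text ("durable truths")
        self.failed = set()  # failed paths are tracked but never indexed

    def _digest(self, fact: str) -> str:
        # Content-address each fact so replays map to the same key.
        return hashlib.sha256(fact.encode()).hexdigest()

    def commit(self, fact: str, ok: bool = True) -> bool:
        """Write a distilled fact. Returns True only on a first successful write."""
        key = self._digest(fact)
        if not ok:
            self.failed.add(key)  # failure path: remember it happened, don't index it
            return False
        if key in self.facts:     # replay: already committed, skip the duplicate
            return False
        self.facts[key] = fact
        return True

    def retract(self, fact: str) -> None:
        """REM-style cleanup: delete a fact found to be contradictory."""
        self.facts.pop(self._digest(fact), None)
```

Keying writes by content hash is what makes a rerun of the same consolidation pass safe: the second pass hits the same digests and writes nothing new.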
tao@apexlearn_org·
What’s unsaid: if you have a Claude subscription, why do you need third-party tools to call Claude from outside? Why not call third-party tools from within Claude Code, with more efficient context management? Official Telegram plugin, a future official Teams plugin, and more… x.com/apexlearn_org/…
Boris Cherny@bcherny

Starting tomorrow at 12pm PT, Claude subscriptions will no longer cover usage on third-party tools like OpenClaw. You can still use these tools with your Claude login via extra usage bundles (now available at a discount), or with a Claude API key.

0 replies · 0 reposts · 2 likes · 206 views
Alex Finn@AlexFinn·
If you used a Claude subscription with OpenClaw, read this:

Unfortunately, all other AI models out there absolutely suck with OpenClaw compared to Opus. It's just a fact, and anyone denying this is delusional.

So here is my new recommended OpenClaw setup: pay for the Opus API and use it as your orchestrator, then use other models as the execution layer. If you do this correctly, yes, your costs will go up, but not by as much as you think.

I use my ChatGPT subscription as the coding execution. GPT 5.4 is excellent at coding. When the Opus orchestrator gives a coding task to the ChatGPT subagent, it always performs really well.

If you are on the Pro plan, you should have enough usage to have ChatGPT be the execution layer for every task. But if you're on the $20 a month plan, you're going to need other subscriptions to handle other tasks. GLM 5.1 and Qwen are excellent; I'd get a cheap sub through them and have them handle all other tasks given to them by the orchestrator.

The best setup, though, if you have the hardware, is the Opus API for the orchestrator, ChatGPT for coding, then local Gemma 4 and local Qwen handling everything else. Right now I have Gemma running on my DGX Spark and Qwen 3.5 on my Mac Studio; they handle all other execution from my Opus API orchestrator.

Unfortunately, all the options above will cost more than the $200 a month subscription. It just is what it is. But if you optimize correctly it won't cost much more, and you'll still get frontier performance.

OpenClaw is the most powerful piece of software ever released. $200 a month ($2,400 a year) was a steal for a digital employee. Honestly, anything under $50,000 a year is a no-brainer if you run a serious business.

The situation isn't great, but you also need to face reality: Claude Opus 4.6 is the best model for OpenClaw. If you use any other model, your productivity will suffer.

Business is a battlefield and I refuse to fall behind, so despite not being happy with the Anthropic decision, the setup above is what I'm going with. Virtue signaling might get me brownie points on the internet, but it won't increase my productivity.
275 replies · 70 reposts · 1.2K likes · 197.8K views
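The orchestrator-plus-execution-layer setup described above boils down to routing each task kind to a cheaper specialist model and reserving the expensive model for planning and fallback. A minimal sketch; the model names and the `run` callables are placeholders standing in for real provider API calls.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Worker:
    name: str
    skills: set                   # task kinds this model handles
    run: Callable[[str], str]     # placeholder for a real API call

def route(task_kind: str, prompt: str, workers: List[Worker], fallback: Worker) -> str:
    """Orchestrator: dispatch each task to the first worker whose skills match,
    falling back to the expensive orchestrator model otherwise."""
    for w in workers:
        if task_kind in w.skills:
            return w.run(prompt)
    return fallback.run(prompt)

# Placeholder executors standing in for real model endpoints.
coder = Worker("gpt-coding", {"coding"}, lambda p: f"[coder] {p}")
local = Worker("local-qwen", {"summarize", "search"}, lambda p: f"[local] {p}")
orchestrator = Worker("opus-api", {"plan"}, lambda p: f"[orchestrator] {p}")

result = route("coding", "fix the failing test", [coder, local], orchestrator)
```

The cost control comes entirely from the dispatch step: the fallback model only runs when no cheaper worker claims the task.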
tao@apexlearn_org·
claude-teams is open source. Apache-2.0. If Anthropic cut off your OpenClaw agents on Teams — this is your recovery path. Same Azure Bot registration. Same Max subscription. No API bills. github.com/daocoding/clau…
0 replies · 0 reposts · 1 like · 29 views
tao@apexlearn_org·
The real unlock isn't connecting AI to a chat app. It's the difference between an AI assistant and an AI companion. An assistant waits for an @mention. A companion is already in the room — listens, learns context, contributes like a colleague. Adoption through collaboration, not training.
1 reply · 0 reposts · 1 like · 30 views
tao@apexlearn_org·
How I kept my $200/month Claude Max instead of paying $4,000+/month on API — and open-sourced the solution. A thread. 🧵
1 reply · 0 reposts · 1 like · 72 views