ck1521 (@ck1523)
2K posts
Bio: 30+, Chinese, my Japanese is poor; I like games (eroge included)
Joined November 2011
1.6K Following · 106 Followers

ck1521 (@ck1523):
@DeepKlee Using 4.6 opus myself I get the same feeling: smart, but it doesn't break new ground. Though that may be because I only ever use it in an instruction-driven way (

Klee Kawaii (@DeepKlee):
GPT-5.5 strikes me as better at agentic coding rather than more intelligent. Its instruction-following is excellent, but it doesn't think enough. My experience talking to it in the app is that it only fills in details along my framing, or makes things more precise; breakthrough designs and new lines of argument all depend on my coaching, and it almost never adds anything at the level of cognition or insight. It feels very much like something that can solve well-defined problems but cannot define them: excellent, yet mediocre. This is just my personal impression, of course. GPT has been dumb at me in more than a few ways for a while now; last time it failed to understand the reordering buffer and backpressure propagation I had designed, and, just to deliver something, wrote a mess that introduced extra complexity, for which I gave it a scolding.

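What a "reordering buffer with backpressure propagation" looks like in code may help here. A minimal sketch under common assumptions about those terms (my illustration, not the design from the thread; all names are made up):

```python
# Out-of-order results are parked and emitted strictly in sequence order;
# once too many are parked waiting for a gap, can_accept() goes False and
# upstream producers are expected to pause (the backpressure signal).
class ReorderBuffer:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.next_seq = 0                     # next sequence number to emit
        self.pending: dict[int, object] = {}  # parked out-of-order items

    def can_accept(self) -> bool:
        """Backpressure signal: producers should pause while this is False."""
        return len(self.pending) < self.capacity

    def push(self, seq: int, item: object) -> list[object]:
        """Park an item; return everything now emittable in order."""
        if not self.can_accept():
            raise BufferError("backpressure: buffer full, slow down")
        self.pending[seq] = item
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready

buf = ReorderBuffer(capacity=4)
print(buf.push(1, "b"))   # [] -- seq 0 hasn't arrived, so "b" is parked
print(buf.push(0, "a"))   # ['a', 'b'] -- gap filled, both emitted in order
```
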
ck1521 (@ck1523):
@gacha_tourist People who'd agree with you have most likely already left the game, like me; the shrinking player base speaks for itself.

saf 🌹 (@gacha_tourist):
feels like i'm getting a weird amount of hate for this, even tho i'm right? lol idk if it's recency bias but i'm kinda sad how everyone forgot the sheer quality of Season 1, it really made me remember what "ZZZ" truly stands for 🥲 Sadly we're never gonna get this again 😕

saf 🌹 (@gacha_tourist):
@ketteithe7th Just a symbol of how far zzz has strayed from its original vision in Season 1 :/ now it's just the same fantasy slop u get in hsr/genshin, I loved zzz for its original aesthetics that were more grounded and urban :( Also Belle and Wise are getting too powerful 🥲 stop these devs

ck1521 reposted
aLFie/RAUM (@aLFie0321):
Burnice
[image]

ck1521 (@ck1523):
@manateelazycat All I can say is I feel something similar when mentoring newcomers: unless the fundamentals are strong and the enthusiasm high, I see no need to invest in training them at all. Compared with a few years ago, it's become very unfriendly to most ordinary newcomers. But I wouldn't say that in a tweet (so, are you about to start teasing a new feature again?

Andy Stewart (@manateelazycat):
For fresh graduates in the AI era, the most important thing beyond professional skill is to lean toward the customer and understand the real need, so you can describe goals, boundaries, and architecture to the AI more precisely.

Before AI, product requirements were mainly carried by product managers and project managers. With AI, the communication chain has shortened drastically and every individual is a key link in the requirements chain, because the cost of execution has dropped to almost nothing.

How do fresh graduates stay competitive? Study user needs. You don't need great rhetorical technique, but through constant communication you have to figure out what the real need is. Clear thinking and communication skills are the most important new skills beyond domain knowledge.

How, concretely? Talk with people, a lot. There's no better method. Communication is now a required skill for software engineers, not an optional one.

The AI era is an era of exploding efficiency, and also an era that is especially unfriendly to newcomers, particularly those who dislike dealing with people. The real bottleneck in understanding needs quickly is not computer science study; it's your experience dealing with people, your decisiveness once you know the need, and your aesthetic sensitivity. If you aren't sensitive to user needs, to interface requirements, to backend stability, it is genuinely hard to compete with experienced engineers.

Many newcomers will ask: without a job, where does experience come from? True enough; the only answer is to learn faster during your newcomer years. There is no other way.

And what if you can't grow quickly at a given company? Then keep switching companies until you do. Hard as that is, it's the trial every newcomer goes through: the sense of survival crisis is the only engine of growth.

In this era, the boundaries between frontend, backend, product manager, and project manager are blurring. Perhaps in the future there will only be "software delivery engineers": what once took a team will take one or two people, and the difference between people will no longer be years of development experience but sensitivity to needs, to interfaces, and to customer service.

I'm sharing my thoughts honestly. If you're graduating this year, perhaps while venting you could also consider whether any of this holds up?

ck1521 (@ck1523):
@skywind3000 I've been writing one of these lately too, planning to build an agent to do my work for me ( I find it pretty interesting that tool calling can be used to drive a state machine.

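Since "tool calling as a state machine" is compact enough to show, here is a minimal sketch (the states, tool names, and helpers are all hypothetical, not any real agent framework's API):

```python
# Each tool the model may call is an edge out of the current state, so the
# harness enforces legal transitions instead of trusting free-form output.
STATES: dict[str, dict[str, str]] = {
    # state:   tool name     -> next state
    "plan": {"write_code": "code"},
    "code": {"run_tests": "test"},
    "test": {"write_code": "code",   # failing tests loop back to coding
             "submit": "done"},
}

def allowed_tools(state: str) -> list[str]:
    """Expose only the tools that are legal transitions from `state`."""
    return list(STATES[state])

def step(state: str, tool_called: str) -> str:
    """Advance the machine, rejecting tool calls the state doesn't allow."""
    if tool_called not in STATES[state]:
        raise ValueError(f"{tool_called!r} not allowed in state {state!r}")
    return STATES[state][tool_called]

# The agent loop would pass allowed_tools(state) to the LLM each turn, then
# call step() with whichever tool the model actually invoked.
state = "plan"
state = step(state, "write_code")   # -> "code"
state = step(state, "run_tests")    # -> "test"
```
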
ck1521 (@ck1523):
@Marasu_ @ChihayaAnon_ze Not sure, but it might come from that old claim that human perception of nature is closer to logarithmic than linear (sensitive to halving or doubling, numb to +1 or +2).

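That idea is usually associated with the Weber–Fechner law: perceived intensity grows roughly with the logarithm of the stimulus. A tiny sketch of what that implies (my illustration, not from the thread):

```python
import math

# perceived ~ k * ln(stimulus): equal *ratios* feel like equal steps,
# while equal *increments* fade as the baseline grows.
def perceived(s: float, k: float = 1.0) -> float:
    return k * math.log(s)

print(perceived(2) - perceived(1))       # ~0.693: doubling from 1
print(perceived(200) - perceived(100))   # ~0.693: doubling from 100 feels the same
print(perceived(101) - perceived(100))   # ~0.00995: +1 from 100 is barely felt
```
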
ck1521 (@ck1523):
@landiantech Plenty of personal websites around the year 2000 were exactly this style. I suddenly feel a generation gap (

蓝点网 (@landiantech):
Hard to look at: model aggregation site #OpenRouter has rolled out a retro theme designed in a 1999 style, but the color scheme is truly hard on the eyes. Purely in terms of page structure the retro look is actually acceptable, but why are there so many colors, all bright and pink-purple? Genuinely blinding: ourl.co/112492?x
[images]

ck1521 (@ck1523):
@manateelazycat I had saved a number of online sandboxes for diagnosing the behavior of malicious files, and now none of them work anymore...

Andy Stewart (@manateelazycat):
The supply-chain attacks causing such an uproar lately actually date back to 2024. Security problems have always lurked in the background. Let me recommend a project: TraceTree. It automatically analyzes the Python, NPM, DMG, and EXE binaries on your machine, trying to surface potential risks ahead of time. Link: github.com/tejasprasad200…
[image]

ck1521 (@ck1523):
@vista8 "Harness" isn't a new term; it's a concept that has existed in unit testing for ages, and it fits this use case fairly well. Also, the failure to converge isn't about the word being new: "harness" has always been a loose umbrella concept, unlike "context", which is precise. So whether it captures the essence of their work is really a separate question; that's what looking at the problem from different angles gets you.

向阳乔木 (@vista8):
I had AI rewrite it. The content:

The AI world's favorite game: giving old things new names.

A long piece on "Harness Engineering" has been going around the AI scene lately: tens of thousands of words, almost certainly AI-written. Chayenne Zhao, an engineer in the SGLang community, finished it, and her first reaction wasn't "great concept." It was: do these people have any ideas besides giving old things new names?

That gripe speaks to me. From Prompt Engineering to Context Engineering, and now Harness Engineering. Every few months someone coins a new term, writes a 10,000-word essay, cites a few big-company case studies, and the whole community starts buzzing. But read it closely and it's always the same thing: design the environment the model runs in, what information to give it, which tools to use, how to manage memory, how to intercept errors. That has existed since the day ChatGPT launched. A new name doesn't make it a new discipline.

Complaints aside, Chayenne went on to write up the pitfalls she has actually hit, and that part is where the article's value lies. She is building a multi-agent system for the SGLang community to answer users' technical questions automatically, such as how to deploy DeepSeek-V3 on 8 GPUs, or whether GLM-5 INT4 differs much from FP8.

The initial idea was the most naive one: build an omniscient Agent, stuff in all of SGLang's docs, code, and cookbooks, and have it answer everything. That failed, of course. The context window isn't memory: the more you stuff in, the more the model's attention scatters and the worse the answers get. An Agent that has to understand quantization, PD disaggregation, diffusion serving, and hardware compatibility all at once ends up deep in none of them.

The design that finally worked: split SGLang's docs along functional boundaries into a number of independent "sub-domain expert Agents," with an Expert Debating Manager on top that receives the question, decomposes it into sub-questions, consults a routing table to activate the right Agents, solves in parallel, then synthesizes the answers.

Three very practical lessons:
① Information fed to an Agent should be precise, not plentiful.
② Split complex systems into specialized sub-modules; don't build an omniscient Agent.
③ All knowledge must live in the repo. Verbal agreements don't exist; routing and constraints must be structural, never left to the model's own judgment.

In traditional software engineering these principles are called separation of concerns, single responsibility, and docs-as-code. Carried over to the LLM setting, some people think they deserve a new name. Chayenne thinks they don't. I agree with the second half, but on the naming question I see it differently; more on that below.

At the end, Chayenne poses a question she hasn't worked out herself: if model capability keeps growing exponentially, will models one day be strong enough to build their own runtime environments? She mentions the OpenClaw project, whose codebase went from 400K lines to 1M in a month, driven mainly by AI itself. So who built that project's environment: a human, or the AI?

The question cuts to the crux. Every "engineering practice" we discuss today, including the real experience behind the term Harness Engineering, presumes that humans actively design the Agent's runtime environment. If that too can eventually be outsourced to AI, how many of today's principles will still hold in two years? Chayenne's answer: at least today, it's still human work, and the most valuable kind. I think that answer is fine, but it carries a faint undercurrent of uncertainty, and she feels it too.

Back to new names. I understand Chayenne's annoyance, but giving old things new names isn't always bad in itself. Before the term Prompt Engineering was coined, the people doing it existed, but the field had no shared language, so discussion and accumulation were inefficient. A new term's value is that it condenses scattered practice into a concept people can actually discuss. The problem comes when the pace of coinage outruns the pace of genuine understanding: then it consumes attention rather than advancing comprehension. The term Harness Engineering is currently the latter; Chayenne's pitfall notes in how-to-sglang are solidly the former.

Chayenne Zhao (@GenAI_is_real):
Today I read a lengthy piece on Harness Engineering — tens of thousands of words, almost certainly AI-written. My first reaction wasn't "wow, what a powerful concept." It was "do these people have any ideas beyond coining new terms for old ones?"

I've always been annoyed by this pattern in the AI world — the constant reinvention of existing concepts. From prompt engineering to context engineering, now to harness engineering. Every few months someone coins a new term, writes a 10,000-word essay, sprinkles in a few big-company case studies, and the whole community starts buzzing. But if you actually look at the content, it's the same thing every time: design the environment your model runs in — what information it receives, what tools it can use, how errors get intercepted, how memory is managed across sessions. This has existed since the day ChatGPT launched. It doesn't become a new discipline just because someone — for whatever reason — decided to give it a new name.

That said, complaints aside, the research and case studies cited in the article do have value — especially since they overlap heavily with what I've been building with how-to-sglang. So let me use this as an opportunity to talk about the mistakes I've actually made.

Some background first. The most common requests in the SGLang community are How-to Questions — how to deploy DeepSeek-V3 on 8 GPUs, what to do when the gateway can't reach the worker address, whether the gap between GLM-5 INT4 and official FP8 is significant. These questions span an extremely wide technical surface, and as the community grows faster and faster, we increasingly can't keep up with replies. So I started building a multi-agent system to answer them automatically.

The first idea was, of course, the most naive one — build a single omniscient Agent, stuff all of SGLang's docs, code, and cookbooks into it, and let it answer everything. That didn't work. You don't need harness engineering theory to explain why — the context window isn't RAM. The more you stuff into it, the more the model's attention scatters and the worse the answers get. An Agent trying to simultaneously understand quantization, PD disaggregation, diffusion serving, and hardware compatibility ends up understanding none of them deeply.

The design we eventually landed on is a multi-layered sub-domain expert architecture. SGLang's documentation already has natural functional boundaries — advanced features, platforms, supported models — with cookbooks organized by model. We turned each sub-domain into an independent expert agent, with an Expert Debating Manager responsible for receiving questions, decomposing them into sub-questions, consulting the Expert Routing Table to activate the right agents, solving in parallel, then synthesizing answers.

Looking back, this design maps almost perfectly onto the patterns the harness engineering community advocates. But when I was building it, I had no idea these patterns had names. And I didn't need to.

1. Progressive disclosure — we didn't dump all documentation into any single agent. Each domain expert loads only its own domain knowledge, and the Manager decides who to activate based on the question type. My gut feeling is that this design yielded far more improvement than swapping in a stronger model ever did. You don't need to know this is called "progressive disclosure" to make this decision. You just need to have tried the "stuff everything in" approach once and watched it fail.

2. Repository as source of truth — the entire workflow lives in the how-to-sglang repo. All expert agents draw their knowledge from markdown files inside the repo, with no dependency on external documents or verbal agreements. Early on, we had the urge to write one massive sglang-maintain.md covering everything. We quickly learned that doesn't work. OpenAI's Codex team made the same mistake — they tried a single oversized AGENTS.md and watched it rot in predictable ways. You don't need to have read their blog to step on this landmine yourself. It's the classic software engineering problem of "monolithic docs always go stale," except in an agent context the consequences are worse — stale documentation doesn't just go unread, it actively misleads the agent.

3. Structured routing — the Expert Routing Table explicitly maps question types to agents. A question about GLM-5 INT4 activates both the Cookbook Domain Expert and the Quantization Domain Expert simultaneously. The Manager doesn't guess; it follows a structured index. The harness engineering crowd calls this "mechanized constraints." I call it normal engineering.

I'm not saying the ideas behind harness engineering are bad. The cited research is solid, the ACI concept from SWE-agent is genuinely worth knowing, and Anthropic's dual-agent architecture (initializer agent + coding agent) is valuable reference material for anyone doing long-horizon tasks. What I find tiresome is the constant coining of new terms — packaging established engineering common sense as a new discipline, then manufacturing anxiety around "you're behind if you don't know this word." Prompt engineering, context engineering, harness engineering — they're different facets of the same thing. Next month someone will probably coin scaffold engineering or orchestration engineering, write another lengthy essay citing the same SWE-agent paper, and the community will start another cycle of amplification.

What I actually learned from how-to-sglang can be stated without any new vocabulary: Information fed to agents should be minimal and precise, not maximal. Complex systems should be split into specialized sub-modules, not built as omniscient agents. All knowledge must live in the repo — verbal agreements don't exist. Routing and constraints must be structural, not left to the agent's judgment. Feedback loops should be as tight as possible — we currently use a logging system to record the full reasoning chain of every query, and we've started using Codex for LLM-as-a-judge verification, but we're still far from ideal.

None of this is new. In traditional software engineering, these are called separation of concerns, single responsibility principle, docs-as-code, and shift-left constraints. We're just applying them to LLM work environments now, and some people feel that warrants a new name. I don't know how many more new terms this field will produce. But I do know that, at least today, we've never achieved a qualitative leap on how-to-sglang by swapping in a stronger model. What actually drove breakthroughs was always improvements at the environment level — more precise knowledge partitioning, better routing logic, tighter feedback loops. Whether you call it harness engineering, context engineering, or nothing at all, it's just good engineering practice. Nothing more, nothing less.

There is one question I genuinely haven't figured out: if model capabilities keep scaling exponentially, will there come a day when models are strong enough to build their own environments? I had this exact confusion when observing OpenClaw — it went from 400K lines to a million in a single month, driven entirely by AI itself. Who built that project's environment? A human, or the AI? And if it was the AI, how many of the design principles we're discussing today will be completely irrelevant in two years? I don't know. But at least today, across every instance of real practice I can observe, this is still human work — and the most valuable kind.

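The "structured routing" and "progressive disclosure" points are concrete enough to sketch. A minimal illustration under my own assumptions (the tags, agent names, and paths below are hypothetical stand-ins, not the actual how-to-sglang implementation):

```python
# The manager consults an explicit table instead of guessing, and each
# expert loads only its own sub-domain's docs (progressive disclosure).
ROUTING_TABLE: dict[str, list[str]] = {
    # question tag   -> expert agents to activate
    "quantization":  ["quantization_expert", "cookbook_expert"],
    "deployment":    ["platforms_expert"],
    "model_support": ["cookbook_expert"],
}

EXPERT_DOCS = {
    "quantization_expert": "docs/advanced/quantization/",
    "platforms_expert":    "docs/platforms/",
    "cookbook_expert":     "cookbooks/",
}

def route(question_tags: list[str]) -> set[str]:
    """Map question tags to experts via the table; no model judgment involved."""
    experts: set[str] = set()
    for tag in question_tags:
        experts.update(ROUTING_TABLE.get(tag, []))
    return experts

# A question like "Is GLM-5 INT4 much worse than FP8?" might be tagged
# ["quantization", "model_support"], activating the quantization and cookbook
# experts in parallel; each reads only its own EXPERT_DOCS path.
print(route(["quantization", "model_support"]))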

ck1521 (@ck1523):
Release time 2028...
[image]

ck1521 reposted
ADHD Memes (@ADHDForReal):
[image]

ck1521 (@ck1523):
@blackanger "What the current task step actually needs" ← how to judge that without the full context is an open problem. Also, plenty of third parties are already building this; if the first parties aren't, there may be considerations beyond the technical ones.

AlexZ 🦀 (@blackanger):
I don't quite get it. Why can't today's agents, e.g. claude code / codex, implement a "sliding window" mechanism for handling context? Instead of lossy compression, store the full session history in external memory, and keep in the window only what the current task step actually needs. The window wouldn't slide in fixed chronological order; it would be assembled dynamically, on demand. That way we'd have unlimited context. In theory.

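A minimal sketch of the mechanism being proposed, which also makes the objection in the reply above concrete: the retrieve() step is exactly where "judging relevance without the full context" has to happen. All names here are hypothetical, not claude code / codex internals:

```python
# Full session history lives losslessly in an external store; each turn the
# window is assembled on demand from what the current step needs, rather
# than sliding by recency.
class ExternalMemory:
    def __init__(self):
        self.log: list[dict] = []   # full, lossless session history

    def append(self, entry: dict):
        self.log.append(entry)

    def retrieve(self, query: str, k: int = 4) -> list[dict]:
        # Stand-in for real retrieval (embeddings, BM25, ...): here just a
        # naive keyword match -- this is the hard "judge relevance" part.
        hits = [e for e in self.log if query.lower() in e["text"].lower()]
        return hits[-k:]

def assemble_window(memory: ExternalMemory, task_step: str,
                    budget: int = 8) -> list[dict]:
    """Build this step's context: relevant history first, then recent turns."""
    relevant = memory.retrieve(task_step, k=budget // 2)
    recent = memory.log[-(budget - len(relevant)):]
    seen, window = set(), []
    for e in relevant + recent:           # dedupe, preserving order
        if id(e) not in seen:
            seen.add(id(e))
            window.append(e)
    return window
```
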
ck1521 (@ck1523):
@rxliuli Right, I think it also depends on how the question is posed. Follow-ups:
1. On the male-dominated-society question, Gemini says yes.
2. A female-dominated society replaces violence with social pressure, which leads to another form of oppression.
3. Early human societies were neither, but at the cost of sacrificing individuality. There are some interesting examples, though (among some Native American peoples, men served as chief but the clan matriarch could depose him).
gemini.google.com/share/e258fb11…
We need a new form of society (

琉璃 (@rxliuli):
@ck1523 The English and Chinese answers lean quite differently: the former seems intent on arguing that this doesn't apply to humans, while the latter engages more with the question itself.

琉璃 (@rxliuli):
While watching the nature show "Animal World", I got into a discussion with Claude: why do male-dominated mammals seem never to have built stable, decentralized power structures? That is, among female-dominated animals there are some positive examples, but among male-dominated ones there isn't a single one.
[images]

ck1521 (@ck1523):
@rxliuli Also, in environments where male advantages (violence, and, I'd guess, impulsiveness/risk-taking) bring no payoff, matriarchal societies may form more readily (the chimpanzee and elephant examples), though not invariably (the hyena example). Human society doesn't seem to have gone down that road yet; we still live in a resource-scarce society ( Discussion thread (in English, since my impression is LLMs discuss things more rigorously in English): gemini.google.com/share/063ce160…

ck1521 (@ck1523):
@rxliuli Chatted with Gemini about this; quite interesting.
1. One major cause of bonobo-style societies is the resource abundance of their environment.
2. For roughly 95% of its history (the foraging and hunting period), human society was also one of distributed power, with no alpha male. Highly centralized social forms only appeared with agriculture.
My own reading: the core driver behind centralized structures is resources that are both scarce and monopolizable by individuals.

ck1521 (@ck1523):
@python_xxt May the best sentience win. An idea that already appeared in a game 30 years ago (

あおいとと (@aoitoto393):
A tale of discovering the true source of that déjà vu
[images]
