Jun-Jun Wan
@_zenzoen
entrepreneur. generalist. contrarian. polyglot. a curious mind. building the future of spontaneous agentic orders. ex-autonomous-driving.
digital · Joined May 2020
232 Following · 53 Followers
Jun-Jun Wan retweeted

New policy from @Atlassian:
Unless you opt out by August 17th 2026, data from Jira and Confluence will automatically be used for AI training. Some data cannot be opted out at all on some plans.
x.com/kepano/status/…


kepano @kepano
if your data is stored in a database that a company can freely read and access (i.e. not end-to-end encrypted), the company will eventually update their ToS so they can use your data for AI training — the incentives are too strong to resist

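While using Hermes Agent, I kept feeling that cross-session search doesn't work well for Chinese. Things the agent clearly said just can't be found by session_search, as if it had total amnesia...
For example, I spent all of yesterday on A2A communication, but when I searched today for "和其他Agent的聊天记录" ("chat history with other agents"), it returned 0 results. The data really is in the database, though: a LIKE substring match finds 20+ rows.
So I had Claude Code look into it. The cause is SQLite FTS5's default tokenizer. English has spaces, so it naturally splits into words.
Chinese has no spaces, so FTS5 treats every Chinese character as an independent token. A search for "记忆断裂" ("memory break") actually runs as 记 & 忆 & 断 & 裂: four separate characters matched independently. The contiguity of the word is lost, and content that does exist isn't found.
Japanese and Korean are the same: no space delimiters, so they hit the same problem, while English users are completely unaffected.
The current fix: when FTS5 returns no results and the query contains CJK characters, automatically fall back to a LIKE substring match. It's a bit slower than FTS5 (full table scan), but more than enough for an agent's data volume.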
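A minimal sketch of the fallback described above, using Python's built-in sqlite3 module. The table names (`messages`, `messages_fts`), the helper names, and the CJK character ranges are assumptions made for illustration; they are not taken from the Hermes Agent code or the PR.

```python
import re
import sqlite3

# Assumed ranges: CJK ideographs (incl. Extension A), kana, Hangul syllables.
CJK_RE = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff\u3040-\u30ff\uac00-\ud7af]")


def contains_cjk(text: str) -> bool:
    """Return True if the text has at least one character in the assumed CJK ranges."""
    return CJK_RE.search(text) is not None


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a plain table and an FTS5 index (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, content TEXT)")
    conn.execute("CREATE VIRTUAL TABLE messages_fts USING fts5(content)")
    for text in ("昨天和其他Agent的聊天记录已经保存", "A2A handshake completed"):
        conn.execute("INSERT INTO messages (content) VALUES (?)", (text,))
        conn.execute("INSERT INTO messages_fts (content) VALUES (?)", (text,))
    return conn


def search_messages(conn: sqlite3.Connection, query: str) -> list[str]:
    """Try FTS5 first; if it finds nothing and the query has CJK text, fall back to LIKE."""
    # Quote the query as an FTS5 string so punctuation is not parsed as query syntax.
    phrase = '"' + query.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT content FROM messages_fts WHERE messages_fts MATCH ?", (phrase,)
    ).fetchall()
    if rows or not contains_cjk(query):
        return [r[0] for r in rows]

    # Fallback: substring match with a full table scan. Escape LIKE wildcards.
    escaped = query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT content FROM messages WHERE content LIKE ? ESCAPE '\\'",
        (f"%{escaped}%",),
    ).fetchall()
    return [r[0] for r in rows]
```

Calling `search_messages(build_demo_db(), "聊天记录")` returns the Chinese row whether or not the FTS5 index matches it, because the LIKE path only runs when FTS5 comes back empty. English queries never reach the fallback.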
PR submitted:
github.com/NousResearch/h…
@NousResearch @Teknium
