Poincaré

2.6K posts

@diffset

Do you know my header picture?

Joined October 2014
460 Following 38 Followers
Pinned Tweet
Poincaré
Poincaré@diffset·
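Zweig was a homebody whose mind lived in the old era, and he died at the dawn of the new one. His outlook trails the ancient Chinese by a full length in spiritual attainment: the Chinese ideal is to read ten thousand books and travel ten thousand miles. Reading without practice is spiritual suicide for a knowledge worker.
停雲@tingyun97

#世界读书日 (World Book Day) A word of mutual encouragement for fellow readers.

Chinese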
0
0
1
877
Poincaré retweeted
Andrej Karpathy
Andrej Karpathy@karpathy·
LLM Knowledge Bases
Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:
Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them.
IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki, I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).
Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents and it reads all the important related data fairly easily at this ~small scale.
Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.
Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searches), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.
Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web ui), but more often I want to hand it off to an LLM via CLI as a tool for larger queries.
Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows.
TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually, it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.
English
1.8K
3.8K
35.2K
8.6M
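The "small and naive search engine over the wiki" mentioned in the post above could look roughly like the following. This is only a sketch under assumptions, not the author's actual tool: the function names `tokenize`, `build_index`, and `search`, the ASCII-only tokenizer, and the raw term-count scoring are all hypothetical choices made for illustration.

```python
import re
from collections import Counter, defaultdict
from pathlib import Path

# Hypothetical tokenizer: lowercase ASCII letters and digits only.
TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return TOKEN.findall(text.lower())


def build_index(wiki_dir):
    """Map each term to {relative .md path: occurrence count}."""
    index = defaultdict(Counter)
    root = Path(wiki_dir)
    for path in sorted(root.rglob("*.md")):
        rel = path.relative_to(root).as_posix()
        for term in tokenize(path.read_text(encoding="utf-8")):
            index[term][rel] += 1
    return index


def search(index, query, limit=5):
    """Rank files by summed counts of the query terms (ties broken by path)."""
    scores = Counter()
    for term in tokenize(query):
        for rel, count in index.get(term, {}).items():
            scores[rel] += count
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [rel for rel, _ in ranked[:limit]]
```

Wrapped in a small command-line entry point, a function like `search` is the kind of thing an LLM agent could call as a tool; a real version would likely want better scoring (for example TF-IDF or BM25) and support for non-ASCII text.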
Poincaré
Poincaré@diffset·
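@fromorient2023 In 2023 this same man pushed a bill in the Senate requiring a supermajority to approve any US withdrawal from NATO, precisely to stop certain scoundrels like himself from stirring up trouble. 😎
Chinese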
0
0
0
46
東方來
東方來@fromorient2023·
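This is the biggest news of all! US Secretary of State Rubio, interviewed by Fox News, dropped a bombshell: "For decades we have poured in billions, hundreds of billions of dollars, and stationed troops, and in the end all we get to do is defend Europe. Yet when we need them, without even asking them to join the airstrikes, merely to use their bases, we are refused. So why should we stay in NATO?" Don't hesitate: withdrawing from NATO is imperative!
Chinese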
182
429
3.4K
296.1K
北京小影子(演员
北京小影子(演员@MarshWatt776·
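Utterly ridiculous. Today my family set me up with a blind date. We added each other on WeChat, and his first message was: "I have a virgin complex. May I ask whether you are a virgin? If not, we don't need to keep chatting..." I really almost told him that the men I have slept with are who knows how many times better than him, and none of them demanded that I be a virgin, so who does he think he is???? I really wanted to tear into him, but thinking of how awkward that would be for the go-between, I just blocked him.
Chinese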
334
1
195
198.6K
老A8集散地
老A8集散地@afterheater·
The young Wentao wasted a golden opportunity. What a beautiful person.
[3 images]
Chinese
16
6
156
115.4K
Poincaré
Poincaré@diffset·
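@Balder13946731 The Senate version of Rubio pushed a bill requiring a Senate supermajority vote before the US can leave NATO, which blocks the State Department version of Rubio from pulling this off.
Chinese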
0
1
13
956
Balder
Balder@Balder13946731·
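Trump says that once he is done with Iran he will settle scores with the NATO countries, and might even pull out of NATO outright. Putin threw everything he had at Ukraine just to slow NATO's eastward expansion; could a single asset, Trump, really be enough to dissolve NATO altogether??
Chinese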
29
4
137
41.6K
Thomas
Thomas@Thomaskong96638·
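If I have one big regret from before I went abroad, it is probably that wellness-center membership card. I topped it up with 50,000 back then on a pay-one-get-one deal, so 100,000 in credit. My regular was technician No. 68. No. 68 turned out to be a former provincial shooting champion, slender and graceful, with clear, clean eyes. Whenever No. 68 was on shift, I would put on sportswear at the weekend, pretend to go out for a run, and head straight to her. Then my immigration visa was approved early, and I left having spent only 20,000 to 30,000 on the card. It is not the money I mourn but those days. Even now, whenever I feel lonely and think back on it, it still torments me.
Chinese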
30
1
82
97.5K
Poincaré retweeted
艾略特
艾略特@elliotchen100·
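I took a look at CC's Memory mechanism, and it is nothing special. The core of the whole memory system is a single MEMORY.md file of no more than 200 lines, stuffed into the context at the start of every session. What happens when the memories pile up? A background subprocess called AutoDream periodically scans, merges, and prunes to make sure the file still fits. Put plainly: the model cannot remember on its own, so memory is simulated with the file system plus LLM self-management.
The approach is solid engineering, but it has several fundamental limitations:
1. Storage and retrieval depend entirely on the file system plus Markdown, so it cannot scale to cross-project or cross-agent scenarios; the memories are islands.
2. There is no real semantic index and no relevance-based dynamic recall; the 200-line index is a hard ceiling.
3. AutoDream's consolidation is rule-driven (scan, merge, prune), not cognition-driven: it can deduplicate and compress, but it cannot distill new insight from experience.
4. There is no forgetting curve and no memory reinforcement mechanism; a memory is either present or deleted, with no intermediate state.
After working on Memory for long enough, you find that the ceiling for this kind of approach is not engineering but architecture. As long as the model's attention mechanism itself does not support efficient retrieval over large-scale historical context, the application layer will be patching forever.
That is also why we chose a different path at EverMind. MSA (Memory Sparse Attention), which we released a while ago, does content-aware sparse routing directly in the Transformer attention layer, so the model learns for itself "what to recall and what to ignore" rather than relying on external scripts to decide for it.
The engineering ability of the "A" company (A 社) is without question top-tier. But this leak shows precisely that Agent Memory is far from solved.
Chinese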
40
110
795
118.8K
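The mechanism described in the post above (a capped memory file loaded at session start, with a background pass that merges and prunes it) can be illustrated with a toy version. This is a sketch under stated assumptions, not Claude Code's implementation: `consolidate` and `build_context` are hypothetical names, the 200-line cap is simply the figure quoted in the post, and the "merge and prune" rules here are plain case-insensitive deduplication and keep-the-newest truncation.

```python
MAX_LINES = 200  # cap quoted in the post; not verified against any real system


def consolidate(lines, max_lines=MAX_LINES):
    """Drop blank and duplicate entries, then keep only the newest ones."""
    seen = set()
    kept = []
    # Walk newest-first so the most recent copy of a duplicate survives.
    for line in reversed(lines):
        entry = line.strip()
        key = entry.lower()
        if not entry or key in seen:
            continue
        seen.add(key)
        kept.append(entry)
        if len(kept) == max_lines:
            break
    kept.reverse()  # restore oldest-to-newest order
    return kept


def build_context(memory_lines, user_prompt):
    """Prepend the consolidated memory to the prompt (hypothetical layout)."""
    memory = "\n".join(consolidate(memory_lines))
    return f"# Memory\n{memory}\n\n# Task\n{user_prompt}"
```

A rule-driven pass like this can only deduplicate and truncate, which is the limitation the post's third point describes; it has no notion of relevance, reinforcement, or gradual forgetting.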
Poincaré retweeted
WquGuru🦀
WquGuru🦀@wquguru·
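The Claude Code source code has leaked. It includes six core state diagrams:
- Main query state machine: the backbone of the main query loop
- Tool Execution state machine: tool scheduling and concurrency/interruption
- Compaction and recovery strategy: context compaction and recovery
- Agent lifecycle state machine and SDK session state machine: the subagent lifecycle and the SDK session layer, respectively
- Permission policy flowchart: fills in the governance and security control logic
Main query state machine:
[image]
Chaofan Shou@Fried_rice

Claude code source code has been leaked via a map file in their npm registry! Code: …a8527898604c1bbb12468b1581d95e.r2.dev/src.zip

Chinese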
18
89
399
109.1K
Poincaré
Poincaré@diffset·
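@royxy So if he TACOs, nothing collapses? You get stabbed, cover your eyes, and that means you cannot die? Is the dollar's credit system simply to be thrown away? If the dollar exchange rate collapses, can the US economy hold up?
Chinese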
0
0
2
317
骆逸
骆逸@royxy·
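I agree with this. If he retreats now, he loses face for a while and loses a string of Gulf allies, but that costs far less than gritting his teeth, fighting on, and letting the empire collapse completely.
小丑能早起@jokergetupearly

@royxy Trump may have been wrong to start the war, but Trump is right to cut his losses by letting the Strait of Hormuz stay blockaded.

Chinese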
28
3
76
28.7K
Poincaré
Poincaré@diffset·
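@caiziboshi No free passage through the strait. Americans paid doubled oil prices for a pile of things that have nothing to do with them. How very MAGA 🤭
Chinese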
0
0
0
84
蔡子博士Chris
蔡子博士Chris@caiziboshi·
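Released by the US State Department on 3/30, Secretary of State Rubio once again made clear: here are the specific objectives of the Iran operation. You should write them down:
1. Destroy Iran's air force
2. Destroy its navy
3. Severely degrade its missile launch capability
4. Destroy its factories
Regime change is not an objective! The fourth item is new compared with day one!
Chinese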
18
1
24
13.7K