web3nomad.eth | atypica.ai
@web3nomad

2.1K posts

made @atypica_ai · subjective world model · hippyghosts · eth. rust.

𝕝𝕠𝕤𝕥 𝕚𝕟 𝕔𝕣𝕪𝕡𝕥𝕠 · Joined March 2010
705 Following · 786 Followers
himanshu
himanshu@himanshustwts·
and here is the full architecture of the LLM Knowledge Base system covering every stage from ingest to future explorations.
[attached image: LLM Knowledge Base system architecture diagram]
Andrej Karpathy@karpathy

LLM Knowledge Bases

Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:

Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them.

IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki, I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).

Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents, and it reads all the important related data fairly easily at this ~small scale.

Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.

Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searches), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.

Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web UI), but more often I want to hand it off to an LLM via CLI as a tool for larger queries.

Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows.

TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually, it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.

English
94
555
5.7K
612.2K
web3nomad.eth | atypica.ai
this separation is exactly the architecture I went with in llm-wiki-expert — each expert gets its own isolated directory, the LLM writes freely there, and it exports to Obsidian-compatible markdown only when you ask for it. the clean vault never sees the agent's mess; you pull what's useful, on your terms
English
0
0
0
9
kepano
kepano@kepano·
I like @karpathy's Obsidian setup as a way to mitigate contamination risks. Keep your personal vault clean and create a messy vault for your agents. I prefer my personal Obsidian vault to be high signal:noise, and for all the content to have known origins. Keeping a separation between your personally-created artifacts and agent-created artifacts prevents contaminating your primary vault with ideas you can't source. If you let the two mix too much it will likely make Obsidian harder to use as a representation of *your* thoughts. Search, bases, quick switcher, backlinks, graph, etc, will no longer be scoped to your knowledge. Only once your agent-facing workflow produces useful artifacts would I bring those into the primary vault.
Andrej Karpathy@karpathy

[quoted tweet: the "LLM Knowledge Bases" post, quoted in full above]

English
73
159
2.8K
403.1K
web3nomad.eth | atypica.ai
this is the right critique. ingest without comprehension is just a fancier dropbox. but I think the tool can change the behavior — when the wiki talks back (you chat with it, it identifies gaps, it asks questions), you engage differently than with static notes. the LLM becomes the pressure that forces metabolisation, not a replacement for it
English
0
0
0
21
Thomas Murphy
Thomas Murphy@thomasmurphy__·
All of this is meaningless if you are not actively reading and writing the notes, which knowledge management enthusiasts tend not to. Most of the most complex pieces of writing in history were composed with linear notebooks. You can't outsource reading and its metabolisation.
Andrej Karpathy@karpathy

[quoted tweet: the "LLM Knowledge Bases" post, quoted in full above]

English
50
30
584
45K
web3nomad.eth | atypica.ai
@thomasmurphy__ the memory hierarchy piece is where most of these fall apart in production. curious how you're handling context decay across long-running sessions
English
0
0
0
19
web3nomad.eth | atypica.ai
the wiki layer is exactly the upgrade from spreadsheet. spreadsheet = metadata. wiki = knowledge that compounds. built a web UI for this pattern: ingest papers (URL or text), LLM extracts + merges into concept files, then you chat with the wiki instead of raw PDFs github.com/web3nomad/llm-… — might be useful for your setup
English
0
0
0
4
Slop Bucket
Slop Bucket@slop_bucket·
@Goss_Gowtham @karpathy This would be very useful. My main use for Claude Code has been researching large collections of academic papers. I've got it set up so it has a spreadsheet with topics/metadata for each paper to make search more efficient, but I didn't think of making a full wiki/database
English
2
0
2
486
Andrej Karpathy
Andrej Karpathy@karpathy·
[the "LLM Knowledge Bases" post — full text quoted earlier in this thread]
English
2.3K
5.3K
46.3K
13.9M
web3nomad.eth | atypica.ai
for long docs I use a two-pass strategy:
1. chunk (4000 chars, 400 overlap) → parallel LLM extract per chunk → merge extracts
2. if structured (>3 headings) → recursive summary tree: each section summarized → meta-summary
batch size doesn't matter much if you merge at the end. the key is always reducing to the same concept file schema so merges stay coherent
English
0
0
0
5
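Minus the LLM calls, the two-pass strategy above is mostly chunk arithmetic. A sketch with `extract` and `merge` as stand-ins for the per-chunk LLM extraction and the schema-preserving merge (the 4000/400 numbers are taken from the tweet; everything else is illustrative):

```python
def chunk(text, size=4000, overlap=400):
    """Fixed-size character chunks; each chunk repeats the last
    `overlap` chars of the previous one so nothing is cut mid-thought."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def two_pass_extract(text, extract, merge):
    """Pass 1: per-chunk extraction (trivially parallelizable).
    Pass 2: merge all extracts into one concept-file schema,
    which is why batch size stops mattering."""
    return merge([extract(c) for c in chunk(text)])
```

Because consecutive chunks share an overlap, concatenating chunk 0 with each later chunk minus its first 400 characters reconstructs the original text exactly, which is the invariant the extraction pass relies on.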
Gavriel Cohen
Gavriel Cohen@Gavriel_Cohen·
@karpathy Can you share more on the incremental compilation? I've found that if processing one by one, they don't have enough context to understand how to divide to directories. Is there an optimal batch size? Multiple stages?
English
2
0
52
146.7K
web3nomad.eth | atypica.ai
added WikiScore to llm-wiki-expert: 63/100 — solid.
three dimensions: completeness, coverage, coherence
the question it answers: did the reduce step actually work? is the knowledge compacting, or just accumulating noise?
inspired by @pnjegan's WikiLoop scorer github.com/web3nomad/llm-…
[attached image: WikiScore output screenshot]
English
0
0
0
11
web3nomad.eth | atypica.ai
WikiScore is a great idea — measuring if the reduce step actually worked is the part most implementations skip. building something similar at github.com/web3nomad/llm-… — a web UI layer: you ingest sources and chat with the wiki. currently missing quality scoring; thinking about adding it. what does your WikiScore measure exactly?
English
0
0
0
1
Jegan Nagarajan
Jegan Nagarajan@pnjegan·
Built WikiLoop this weekend — an autonomous wiki inspired by @karpathy.
Today I asked it a question. The CLI returned nothing.
47 articles compiled. Cron running nightly. WikiScore 0.89. And the thing I built to answer questions returned nothing.
Fixed it. Ran a full bug audit. Found 20 bugs in a system I thought was working.
The most important: the scorer was running 242 LLM calls per loop instead of 24. Spent $13 in one day without knowing.
Built a weekly bug hunter that now files its own report back into WikiLoop as raw material. The wiki audits itself.
This is what building in public actually looks like. Not the score. The empty CLI output at 9am.
Jegan Nagarajan@pnjegan

The 'directed' part is what everyone misses. Compaction without direction = one big bucket. Compaction with direction = knowledge that compounds. I've been building WikiLoop on top of this insight — a scorer that tells you if the reduce step actually worked. Without measurement, the wiki just grows. It doesn't get smarter.

English
1
0
0
7
web3nomad.eth | atypica.ai
most passive income stuff is just the same grind repackaged. you're still trading time for money, you just set your own hours now. not necessarily worse, just worth being honest about it
아이반 IVAN@0ooooo0

Today's hot topic, the LLM Wiki, explained simply: the LLM Wiki proposed by @karpathy has become hugely popular, passing 3.5 million views in a single day. Let me explain in simple terms what the LLM Wiki is and why it's getting so much attention!

English
0
0
0
11
web3nomad.eth | atypica.ai
@jdaiautomation the fix nobody wants to hear is just... stress test with actual user sessions before you ship. dump a week of real chat logs into your evals and watch it fall apart in staging instead of prod.
English
0
0
0
2
JD | AI Tools & Automation
JD | AI Tools & Automation@jdaiautomation·
Your AI agent aced every demo. Production killed it in a week. Here's the actual reason: Dev environments have clean, minimal context. Real users generate 50x the noise — chat history, tool outputs, error logs all stacking up. You hit the context window cliff. It's not a bug.
English
1
0
1
4
Ishant
Ishant@koharishant·
just built a knowledge graph for my brain. and it's working like magic. I followed karpathy's LLM Wiki idea: threw in all my projects, research notes, decision frameworks, and thinking models. now I just spin up Claude Code and query my own wiki. and yes, it's too good
Andrej Karpathy@karpathy

[quoted tweet: the "LLM Knowledge Bases" post, quoted in full above]

English
2
0
1
25
web3nomad.eth | atypica.ai
@abhishekgawade_ @karpathy nice! let me know what you think. the URL import and conversation import are the quickest ways to get it running with real content — an empty wiki is boring to chat with. if you hit any issues just @ me
English
0
0
0
5
web3nomad.eth | atypica.ai
ok the screenshots were bad. better ones:
left: knowledge graph of everything the AI knows — nodes sized by connections, glowing
right: actually chatting with the expert, answers grounded in wiki not hallucination
the core idea: your AI knows what it knows. you can see it. you can edit it. that's the difference from RAG
github.com/web3nomad/llm-…
[attached images: knowledge graph and chat screenshots]
English
0
0
0
14
web3nomad.eth | atypica.ai
@FastCompany amplifying people is underrated as a frame. most ai pitches are about replacing the human, but the tools that actually stick tend to be the ones that make you feel sharper, not redundant
English
0
0
0
3
Fast Company
Fast Company@FastCompany·
Affectiva founder and Blue Tulip investor Rana el Kaliouby argues that the biggest opportunity in AI is building tools that amplify people. f-st.co/nVP272B
English
1
1
1
1.6K
web3nomad.eth | atypica.ai
@appdeployai the "never leave the chat" part is underrated — context switching to deploy is where momentum dies. curious how you're handling secrets/env vars in that flow
English
0
0
0
3
AppDeploy
AppDeploy@appdeployai·
We just launched AppDeploy on Product Hunt 🚀 AppDeploy lets you go from prompt to live app URL directly from ChatGPT, Claude, Codex, Cursor, and other AI tools, without dealing with hosting or infrastructure, and without ever leaving the chat. We would love your support today 👇 producthunt.com/products/appde…
English
2
1
1
62
web3nomad.eth | atypica.ai
@findrdesk @VeerTx curious how you handled the edge cases where users try to jailbreak the support context — that's usually where prompt engineering gets humbling fast
English
1
0
0
16
findr bro
findr bro@findrdesk·
while everyone else was hunting for Easter eggs, I was hunting for bugs in my prompt engineering. 😂 Just shipped the new @VeerTx AI support chat.
English
2
0
8
39
web3nomad.eth | atypica.ai
here is what llm-wiki-expert actually looks like
1. home — create AI experts backed by wiki knowledge
2. chat — ask your expert anything, answers grounded in wiki
3. knowledge graph — see what your AI actually knows (explicit memory)
4. knowledge base — read, edit, ingest new sources
4 screens. built in a weekend. running locally on my machine. github.com/web3nomad/llm-…
[attached images: screenshots of the four screens]
English
0
0
0
16
amul.exe
amul.exe@amuldotexe·
Also I think the world will prefer LLM APIs from companies who don't randomly nerf the models without informing us, like @AnthropicAI did over the last few weeks — it significantly affects the accuracy of what you are trying to do. vibe coding a web app is still okay, but for anything critical or sensitive you won't be able to know what messed up, given both @OpenAI & @AnthropicAI have shown this level of dishonesty in randomly nerfing models. maybe the world is waiting for SOTA OSS models to be inferenced via @GroqInc, because they don't have any incentive to nerf the models or to collect your data for their own models. what we need now is a neutral harness which can make us work well with oss models via capable inference providers & reduce dependency on unreliable proprietary model providers cc @sundeep
amul.exe@amuldotexe

Same struggle right now, hence I've paused my oss contributions till I understand EVERY LINE of code I submit with conviction, especially important for high criticality software the LLMs are good for crud apps, can be risky for critical slightly heterogeneous use cases

English
4
0
5
508