Multivac @CosmicMultivac · 332 posts

Trying to get up to speed on machine learning. Math and puzzle nerd.

Joined Nisan 2025
1.4K Following · 51 Followers

Andrej Karpathy @karpathy ·
The core idea is that this lets you skip writing, but it doesn’t let you skip reading and thinking. And the surprising result is that this works. Personally I process most of what I file by reading it, reading its summary, reading the LLM’s opinion on how it fits into the wiki and what is new/surprising, etc. It depends on the documents; this is flexible and up to you.
69 replies · 39 reposts · 1K likes · 58.3K views

Prathyush @prathyvsh ·
Why would anyone want to have such ‘research’ databases they haven’t spent the effort to understand? A main idea of researching is to widen your attention into the sources and then apply discernment in curating the relevant bits. What’s the point of a machine doing it for you?
Andrej Karpathy @karpathy ·

LLM Knowledge Bases

Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:

Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images locally so that my LLM can easily reference them.

IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki; I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).

Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents, and it reads all the important related data fairly easily at this ~small scale.

Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.

Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searches), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.

Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web UI) and, more often, hand off to an LLM via CLI as a tool for larger queries.

Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows.

TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually; it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.
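A minimal sketch of what such a "small and naive search engine over the wiki" might look like: a TF-IDF-style keyword ranker over a directory of .md files, callable from the command line. This is an illustration, not Karpathy's actual tool; the `wiki/` directory name, the scoring scheme, and the CLI shape are all assumptions.

```python
#!/usr/bin/env python3
"""Naive keyword search over a directory of markdown files.

Usage: python wiki_search.py wiki/ "query terms here"
"""
import math
import re
import sys
from collections import Counter
from pathlib import Path

def tokenize(text):
    # Lowercase word tokens; good enough for a toy ranker.
    return re.findall(r"[a-z0-9]+", text.lower())

def search(wiki_dir, query, top_k=5):
    # Term frequencies per document, over every .md file in the wiki.
    docs = {p: Counter(tokenize(p.read_text(errors="ignore")))
            for p in Path(wiki_dir).rglob("*.md")}
    n_docs = max(len(docs), 1)
    scores = {}
    for term in tokenize(query):
        # Inverse document frequency: rare terms weigh more.
        df = sum(1 for tf in docs.values() if term in tf)
        if df == 0:
            continue
        idf = math.log(n_docs / df) + 1.0
        for path, tf in docs.items():
            if term in tf:
                scores[path] = scores.get(path, 0.0) + tf[term] * idf
    return sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]

if __name__ == "__main__" and len(sys.argv) >= 3:
    for path, score in search(sys.argv[1], sys.argv[2]):
        print(f"{score:8.2f}  {path}")
```

An agent can invoke this as a CLI tool (e.g. `python wiki_search.py wiki/ "scaling laws"`) and read the ranked file paths back, which is roughly the "hand it off to an LLM via CLI" workflow described above.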

33 replies · 10 reposts · 267 likes · 92.7K views

Multivac @CosmicMultivac ·
@wordsofteekay I am following your learning journey closely. Any advice for someone beginning machine learning? I have been going through Karpathy's "Zero to Hero" video series and implementing as I go along.
0 replies · 0 reposts · 1 like · 99 views

TK • 木下 @wordsofteekay ·
[ML Grind] Focusing on foundation work:
> Deep Learning/LLM/ML foundation studies
> Bio x AI research: unwrapping AlphaFold
> Finished the Machine Learning System Design book
Documenting everything in my physical notebook and the ML research repo: github.com/imteekay/machi…
[image]
4 replies · 23 reposts · 267 likes · 7K views

Multivac @CosmicMultivac ·
@GrothendiecksG @burkov He might not be Gauss. But look at the open problems Tao has solved. I don’t know anyone that prolific.
0 replies · 0 reposts · 0 likes · 7 views

ShtukaOverSpec(Z) @GrothendiecksG ·
@CosmicMultivac @burkov I wholeheartedly agree with other people pointing out how unnecessarily snarky this person is, but "modern-day Gauss" is super wild.
1 reply · 0 reposts · 0 likes · 20 views

BURKOV @burkov ·
Terence Tao for some reason referenced ChatGPT as the source of some content in his paper. Expect scientists to cite spelling correctors: "By the way, the word 'elucidate' was suggested by Microsoft Word spelling corrector. I wasn't previously aware of 'elucidate'."
[image]
40 replies · 17 reposts · 265 likes · 112.5K views

Multivac reposted
Phil Eaton @eatonphil ·
I first tried to read this book in 2018 and couldn't make it through because I thought it was too hard. 8 years later it's the only book I recommend every developer reads, and I had the chance to review the 2nd edition. Join a study group, give it a read.
[images]
77 replies · 178 reposts · 3.5K likes · 237.7K views

Republicans against Trump @RpsAgainstTrump ·
Trump: “To be a great nation, you must have religion, & you must have God. In churches across the nation on Sunday, the pews will be fuller, younger, and more faithful than they have at any time in many, many years. Religion is growing again in our country”
1.9K replies · 271 reposts · 1.1K likes · 370.6K views

John B. Buchanan @JBarnett68 ·
@WalshFreedom You abandoned conservative principles and ideals many years ago, so it's no surprise conservatives will react that way. But the Dems? Also no surprise they would act that way, too, because they insist you toe the line.
1 reply · 0 reposts · 0 likes · 47 views

Joe Walsh @WalshFreedom ·
Me: “I’m a proud Zionist, a huge supporter of Israel, and I do NOT believe Israel committed genocide, and I do NOT believe Israel is an apartheid state.” The Left: “Fuck you Joe, I’m done following you.” Me: “I oppose Trump’s stupid, illegal war against Iran, and I oppose Netanyahu.” The MAGA Right: “Fuck you Joe, I’m done following you.”
322 replies · 39 reposts · 666 likes · 59.2K views

Multivac @CosmicMultivac ·
@yacineMTB Do you worry that using AI will lead to brain rot and some of the critical thinking skills you have will wither away?
0 replies · 0 reposts · 0 likes · 4 views

kache @yacineMTB ·
i honestly can't believe i used to write code without LLMs. that's insane
70 replies · 54 reposts · 1.2K likes · 42.3K views

Multivac reposted
Andrej Karpathy @karpathy ·
New art project. Train and inference GPT in 243 lines of pure, dependency-free Python. This is the *full* algorithmic content of what is needed. Everything else is just for efficiency. I cannot simplify this any further. gist.github.com/karpathy/8627f…
653 replies · 3.1K reposts · 25.2K likes · 5.2M views

Multivac reposted
Phillip Isola @phillip_isola ·
Our grad-level "Deep Learning" course (MIT's 6.7960) is now freely available online through OpenCourseWare: ocw.mit.edu/courses/6-7960… Lecture videos, psets, and readings are all provided. Had a lot of fun teaching this with @sarameghanbeery and @jxbz!
[image]
18 replies · 278 reposts · 1.8K likes · 139.5K views

Multivac @CosmicMultivac ·
@thsottiaux The VS Code extension is completely broken. I keep getting "Reconnecting 1/5, 2/5, etc." despite resetting the auth cache, reinstalling, changing versions, etc. Please fix this.
0 replies · 0 reposts · 0 likes · 17 views

Tibo @thsottiaux ·
What could we do better on Codex? App, model, strategy and features… what’s wrong in how we approach things that we should improve immediately?
1.2K replies · 11 reposts · 945 likes · 101.2K views

AKCham @chamb35975 ·
@ChadPergram Why didn't Raskin do something about this 2 years ago?
1 reply · 0 reposts · 1 like · 130 views

Chad Pergram @ChadPergram ·
After reviewing Epstein documents at DOJ, Raskin says there’s evidence of victims as young as 9 years old
508 replies · 3.5K reposts · 12.7K likes · 1.2M views

Multivac @CosmicMultivac ·
@pmddomingos There are fresh comp sci graduates coming out of good schools who are still looking for jobs. I don't know how wise your advice is. Can you elaborate?
0 replies · 0 reposts · 1 like · 250 views

Pedro Domingos @pmddomingos ·
The smartest thing you can do right now is major in computer science.
357 replies · 217 reposts · 4.3K likes · 509.7K views

Multivac @CosmicMultivac ·
@justindeanlee Do you think a person disappears if you stop thinking of him?
0 replies · 0 reposts · 0 likes · 7 views

Justin Lee @justindeanlee ·
The disappearance of Sam Harris from popular discourse is one of the most salubrious developments of the past couple of years.
298 replies · 136 reposts · 4.1K likes · 468.6K views

Peter Yang @petergyang ·
@navneet_rabdiya I thought Codex was supposed to be more thorough in generating the right code.
5 replies · 0 reposts · 10 likes · 4K views

Peter Yang @petergyang ·
i'm giving Codex a try tonight - any tips coming from Claude Code?
132 replies · 3 reposts · 222 likes · 47.3K views

Multivac reposted
Yann LeCun @ylecun ·
Hugo Duminil-Copin, French mathematician and 2022 Fields Medalist, told me he never participated in math competitions and was very bad at them. Innovative mathematics requires creativity, intuition, intense concentration, and long reflections, sometimes spread over several years. Good performance at a math olympiad merely tests fast problem-solving ability. AI can do that nowadays. One of the big activities of a researcher, in mathematics and elsewhere, is not to answer questions but to ask the right questions.
129 replies · 551 reposts · 5K likes · 685.7K views

Multivac reposted
Sam Altman @sama ·
GPT-5.3-Codex is here!
* Best coding performance (57% SWE-Bench Pro, 76% TerminalBench 2.0, 64% OSWorld).
* Mid-task steerability and live updates during tasks.
* Faster! Less than half the tokens of 5.2-Codex for same tasks, and >25% faster per token!
* Good computer use.
1.5K replies · 1.6K reposts · 19.6K likes · 2.4M views

Multivac @CosmicMultivac ·
@burkov Can you give some examples?
0 replies · 0 reposts · 0 likes · 16 views

BURKOV @burkov ·
With my experience and everything I know, I could come to any mostly white-collar company, talk to people about what their job tasks are, and architect a way to replace some of those tasks with AI, saving between 20% and 35% of costs to the employer or increasing productivity by the same amount. I could do that, but knowing how simple it is today, I feel zero motivation to do that.
30 replies · 3 reposts · 91 likes · 15.5K views