Andrej Karpathy

10K posts

@karpathy

I like to train large deep neural nets. Previously Director of AI @ Tesla, founding team @ OpenAI, PhD @ Stanford.

Stanford · Joined April 2009
1.1K Following · 2M Followers
Pinned Tweet
Andrej Karpathy @karpathy
The hottest new programming language is English
1.8K
7.7K
60.4K
10.7M
Andrej Karpathy @karpathy
@shikhr_ Yeah, I have 4 blog posts that I didn’t finish yet; this is one of them. Dobby runs my entire house over WhatsApp: lights, shades, pool/spa, Sonos, security, HVAC, etc.
18
5
219
17.2K
Shikhar @shikhr_
@karpathy Dobby the House Elf claw??? What did I miss!
1
0
35
15.7K
Andrej Karpathy @karpathy
Thank you Jensen and NVIDIA! She’s a real beauty! I was told I’d be getting a secret gift, with a hint that it requires 20 amps. (So I knew it had to be good). She’ll make for a beautiful, spacious home for my Dobby the House Elf claw, among lots of other tinkering, thank you!!
NVIDIA AI Developer @NVIDIAAIDev

🙌 Andrej Karpathy’s lab has received the first DGX Station GB300 -- a Dell Pro Max with GB300. 💚 We can't wait to see what you’ll create @karpathy! 🔗 blogs.nvidia.com/blog/gtc-2026-… @DellTech
483
768
17.5K
856.2K
Andrej Karpathy @karpathy
Ugh X breaks time links, it’s at 26:17
20
4
286
57.2K
Andrej Karpathy @karpathy
@Yulun_Du @ilyasut SGD is a ResNet too (the blocks of it are fwd+bwd), the residual stream is the weights so... 🤔 We're not taking the Attention is All You Need part literally enough? :D
28
38
577
90.6K
Andrej Karpathy @karpathy
@ChristosTzamos Wait this is so awesome!! Both 1) the C compiler to LLM weights and 2) the logarithmic complexity hard-max attention and its potential generalizations. Inspiring!
27
39
1.3K
33.9K
Christos Tzamos @ChristosTzamos
1/4 LLMs solve research-grade math problems but struggle with basic calculations. We bridge this gap by turning them into computers. We built a computer INSIDE a transformer that can run programs for millions of steps in seconds, solving even the hardest Sudokus with 100% accuracy
239
786
5.9K
1.6M
Andrej Karpathy @karpathy
@rasbt @teortaxesTex Ty, I used your blog post, exported with obsidian ext into markdown, used it to enqueue ideas into my autoresearch loop
11
1
109
6.1K
Andrej Karpathy @karpathy
@_kaitodev @Ignaci0m_ The "exposure" was scored by an LLM based on how digital the job is. This has no bearing on what actually happens to these occupations, which has to do with demand elasticity and a lot more. People are sensationalizing the visualization tool and putting words in my mouth.
18
9
144
13K
Kaito | 海斗 @_kaitodev
thank you 🙏
the people’s reaction was overblown and completely justified at the same time. honestly, the data itself is still the most interesting part to me. even if the scoring methodology is somewhat rough, the pattern it reveals (that exposure tracks almost perfectly with “can you do this from a laptop”) is a real insight worth checking out.
ppl weren’t panicking about the data but more about what the data actually confirmed.
2
0
10
4.7K
Kaito | 海斗 @_kaitodev
5 minutes ago, @karpathy just dropped karpathy/jobs!
he scraped every job in the US economy (342 occupations from BLS), scored each one's AI exposure 0-10 using an LLM, and visualized it as a treemap.
if your whole job happens on a screen you're cooked.
average score across all jobs is 5.3/10. software devs: 8-9. roofers: 0-1. medical transcriptionists: 10/10 💀
karpathy.ai/jobs
[image]
967
1.8K
12.1K
3.5M
Andrej Karpathy @karpathy
This was a Saturday morning 2-hour vibe-coded project inspired by a book I’m reading. I thought the code/data might be helpful to others to explore the BLS dataset visually, or color it in different ways or with different prompts, or add their own visualizations. It’s been wildly misinterpreted (which I should have anticipated, even despite the readme docs), so I took it down.
25
7
131
11.9K
Zhikai Zhang @Zhikai273
🎾 Introducing LATENT: Learning Athletic Humanoid Tennis Skills from Imperfect Human Motion Data
Dynamic movements, agile whole-body coordination, and rapid reactions. A step toward athletic humanoid sports skills.
Project: zzk273.github.io/LATENT/
Code: github.com/GalaxyGeneralR…
162
640
4.1K
1.3M
Andrej Karpathy @karpathy
@vivek_2332 Yep, exactly and agree! Any process with a lot of knobs and objective criteria benefits a lot.
15
9
433
38.7K
Vivek @vivek_2332
introducing autoresearch-rl, autonomous research for rl post-training.
inspired by @karpathy autoresearch, and i think rl post-training is honestly one of the places where this idea fits perfectly. there are at least 50+ hyper parameters to tweak, learning rate, batch size, rollouts, clipping ratios, kl penalties, schedulers, the list goes on. instead of sitting there for hours turning knobs one at a time, just let the model figure out the right starting config on its own.
some things worth mentioning:
-> built on @PrimeIntellect prime-rl (my favourite rl post-training framework) and @willccbb verifiers for reward verification.
-> ran qwen2.5-0.5b-instruct on gsm8k across 60+ autonomous experiments. eval score went from 0.475 to 0.550 and the agent actually found a way to do it in fewer steps (20 instead of 30). less compute, better results
-> the whole thing was surprisingly smooth to set up and run. point the agent at the config, go to sleep, wake up to a full experiment log. i really wish i could try this on a bigger model but gpu poor for now lol
-> the agent discovers things you wouldn't think to try. like how rollouts = 4 beats rollouts = 8, or how a constant lr schedule outperforms cosine. it just methodically tests everything
i think the real value here is that rl training is so fragile and noisy that having an agent patiently run experiment after experiment is genuinely more effective than a human doing it manually.
check it out: github.com/vivekvkashyap/…
[image]
22
53
749
78.3K
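The knob-turning loop described in Vivek's post can be sketched in a few lines. This is only an illustration under stated assumptions, not the autoresearch-rl code: `SEARCH_SPACE`, `START`, `toy_eval`, and `autoresearch` are hypothetical names, the scores are made up, and a plain one-knob-at-a-time search stands in for the agent that proposes experiments in the real project.

```python
# Hypothetical search space: knob names echo the post, values are illustrative.
SEARCH_SPACE = {
    "rollouts": [4, 8, 16],
    "lr_schedule": ["constant", "cosine"],
    "learning_rate": [1e-6, 3e-6, 1e-5],
}

# Hypothetical starting config.
START = {"rollouts": 8, "lr_schedule": "cosine", "learning_rate": 1e-6}


def toy_eval(config):
    """Toy stand-in for "train, then evaluate". NOT a real RL run; scores are made up."""
    score = 0.0
    score += {4: 0.3, 8: 0.2, 16: 0.1}[config["rollouts"]]
    score += {"constant": 0.2, "cosine": 0.1}[config["lr_schedule"]]
    score += {1e-6: 0.1, 3e-6: 0.3, 1e-5: 0.2}[config["learning_rate"]]
    return score


def autoresearch(search_space, evaluate, start):
    """Change one knob at a time, keep any change that improves the score,
    and stop after a full pass over all knobs yields no improvement."""
    best = dict(start)
    best_score = evaluate(best)
    log = [(dict(best), best_score)]  # experiment log: (config, score) pairs
    improved = True
    while improved:
        improved = False
        for knob, values in search_space.items():
            for value in values:
                if value == best[knob]:
                    continue  # already the current setting
                candidate = {**best, knob: value}
                score = evaluate(candidate)
                log.append((candidate, score))
                if score > best_score:
                    best, best_score = candidate, score
                    improved = True
    return best, best_score, log
```

Calling `autoresearch(SEARCH_SPACE, toy_eval, START)` returns the best config found, its score, and the full experiment log. With a noisy real objective, each candidate would likely need repeated runs before being accepted.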
Andrej Karpathy @karpathy
My autoresearch labs got wiped out in the oauth outage. Have to think through failovers. Intelligence brownouts will be interesting - the planet losing IQ points when frontier AI stutters.
532
300
7K
566K
Andrej Karpathy @karpathy
Human orgs are not legible, the CEO can’t see/feel/zoom in on any activity in their company, with real time stats etc. I have no doubt that it will be possible to control orgs on mobile, with voice etc., but with this level of legibility will that be optimal? Not in principle and asymptotically but in practice and for at least the next round of play.
82
49
1.2K
204.2K
Andrej Karpathy @karpathy
All of these patterns as an example are just matters of “org code”. The IDE helps you build, run, manage them. You can’t fork classical orgs (eg Microsoft) but you’ll be able to fork agentic orgs.
[image]
166
241
3.5K
400.9K
Andrej Karpathy @karpathy
Expectation: the age of the IDE is over
Reality: we’re going to need a bigger IDE (imo). It just looks very different because humans now move upwards and program at a higher level - the basic unit of interest is not one file but one agent. It’s still programming.
Andrej Karpathy @karpathy

@nummanali tmux grids are awesome, but i feel a need to have a proper "agent command center" IDE for teams of them, which I could maximize per monitor. E.g. I want to see/hide toggle them, see if any are idle, pop open related tools (e.g. terminal), stats (usage), etc.
791
832
10.5K
2.3M
Amit Prakash @amit05prakash
@karpathy I can finally justify my reasons for buying more monitors now
3
0
89
37.9K
Andrej Karpathy @karpathy
@trongthangpham @maxbittker ralph loop runs headless. i dislike headless sessions. i need to see and supervise agent work, possibly ask /btw questions of them, possibly pitch in ideas to the mix, etc etc.
21
4
200
10.3K
Trong-Thang Pham @trongthangpham
@karpathy @maxbittker I thought your version is similar to the ralph loop (the bash one) so it would loop forever. Is that not the case here?
1
1
13
8K
max @maxbittker
From @karpathy's autoresearch .md
[image]
50
124
3.1K
219.5K
Andrej Karpathy @karpathy
@nvbkdw @nummanali yes, solid work trending in a good direction, but almost all my work is across like 20 different machines (my local, my claw machine, my gpu machines). possibly they could add ssh mode, a bit like VS Code does (for the same reasons).
16
1
110
15.3K
Numman Ali @nummanali
Claude Code teams with tmux is really cool
When you run with team mode enabled in tmux, it automatically opens the additional terminal in pane
I don't really get my main agent to orchestrate, I chat to them myself
CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=true claude
[image]
62
75
1.4K
186.7K