Spaghetti Eddie

3.5K posts

@DeepDiscussion1

Street Epistemology YouTuber, Dad, Tesla fanatic.

Tyler, TX · Joined April 2017
221 Following · 408 Followers
Farzad 🇺🇸 🇮🇷@farzyness·
I asked Grok 420 to dumb this down for all of us retards:

### What actually happened

Karpathy (a famous AI guy) built a little AI helper called “autoresearch.” He left it alone for 2 days on a tiny version of his project called nanochat. That little AI ran **700 experiments by itself**. It kept looking at the results, learning what worked, and trying smarter ideas next — exactly like a human researcher would do.

It discovered **~20 real improvements** that no one (including Karpathy) had found in years of manual tweaking. He tested them on a bigger model and they all added up nicely.

Result? His project now trains **11% faster** — the official leaderboard score dropped from 2.02 hours to 1.80 hours. That might sound small, but in AI it’s a *huge* win.

### Why this feels like a giant deal

Normally Karpathy (and every AI engineer) does this the old-fashioned way, as he has for 20 years:

- Think of an idea
- Code it
- Test it
- Read papers for inspiration
- Repeat forever

It’s slow, tiring, and you miss stuff. This time the AI did the **entire loop by itself** — no human in the loop for days. It spotted things like:

- “Hey, you forgot a tiny knob that makes the model pay attention better”
- “Your model likes extra guardrails on one part (regularization) — you weren’t using any!”
- “Your attention window is too narrow — let’s open it up”
- “Your optimizer settings are weird — here’s the fix”
- Better schedules and starting points, etc.

These are the kind of tiny-but-powerful tweaks that usually take humans weeks or months to find. The AI found them in 2 days, and they all stacked together.

### The bigger picture (this is the exciting part)

Karpathy says: **“Every big AI lab is going to do this. It’s the final boss battle.”** Here’s what he means in plain English:

1. Start with a swarm of AIs (not just one).
2. Let them experiment on cheap, tiny models.
3. Keep only the best ideas and try them on bigger models.
4. Humans only jump in for the really hard stuff.
5. Repeat → you get faster and smarter models with way less human work.

And it’s not just for training speed. **Any goal you can measure** (speed, accuracy, cost, safety, creativity — whatever) can be handed to an AI swarm if you have a quick way to test it. Your problem might be next.

### Bottom line

This is the first time Karpathy watched an AI do his entire 20-year job **autonomously** and actually beat him at it. It’s not sci-fi anymore — it’s happening right now on a public GitHub project.

That’s why it feels like a giant deal. It’s the moment we went from “humans tune the AI” to “AI tunes the AI… and does a shockingly good job.”

Mind officially blown? Yeah, mine too. 😄
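The experiment loop described above can be sketched in a few lines. To be clear, everything here is hypothetical toy code: the function names, the fake "training" objective, and the random proposal strategy are all invented for illustration, and the real autoresearch agent reasons over actual training runs rather than perturbing one number.

```python
import random

def train_and_eval(config):
    """Toy stand-in for a real training run: pretend the best
    validation loss comes from a learning rate near 0.01."""
    return abs(config["lr"] - 0.01) + random.uniform(0, 1e-3)

def propose(history):
    """Propose the next experiment from past results. Here: randomly
    perturb the best config so far; a real agent would reason over
    the full sequence of experiment logs."""
    best_cfg, _ = min(history, key=lambda h: h[1])
    return {"lr": max(1e-4, best_cfg["lr"] * random.uniform(0.5, 2.0))}

random.seed(0)
# Start from a deliberately bad baseline config.
history = [({"lr": 0.1}, train_and_eval({"lr": 0.1}))]
for _ in range(200):                     # the real run did ~700 experiments
    cfg = propose(history)
    history.append((cfg, train_and_eval(cfg)))

best_cfg, best_loss = min(history, key=lambda h: h[1])
print(best_cfg["lr"], round(best_loss, 4))
```

The point is only the shape of the loop: propose, run, observe, propose again, with no human in between iterations.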
Andrej Karpathy@karpathy

Three days ago I left autoresearch tuning nanochat for ~2 days on the depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference. I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was an already fairly well manually-tuned project.

This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This has been the bread and butter of what I do daily for 2 decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things, e.g.:

- It noticed an oversight that my parameterless QKnorm didn't have a scaler multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that the AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges.

And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
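The QKnorm bullet above can be illustrated with a toy calculation. This is a hedged sketch with made-up dimensions and a made-up multiplier, not nanochat's actual code: when queries and keys are normalized to unit length, every attention logit is a cosine in [-1, 1], so without a scale multiplier the softmax over the logits is nearly uniform ("too diffuse"); multiplying the logits by a scalar sharpens it.

```python
import math
import random

def unit(v):
    """Normalize a vector to unit length (the 'parameterless' QK norm)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

random.seed(0)
dim, n_keys = 64, 8
q = unit([random.gauss(0, 1) for _ in range(dim)])
ks = [unit([random.gauss(0, 1) for _ in range(dim)]) for _ in range(n_keys)]

# Each logit is a cosine similarity, so it lies in [-1, 1].
logits = [sum(a * b for a, b in zip(q, k)) for k in ks]

diffuse = softmax(logits)                      # no multiplier: near-uniform
sharp = softmax([12.0 * l for l in logits])    # scalar multiplier: peakier

print(max(diffuse), max(sharp))
```

The 12.0 here is an arbitrary illustrative value; in a model the multiplier would be learned per head.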

28 replies · 74 reposts · 757 likes · 98.9K views
Gali@Gfilche·
Rumor: @SpaceX merging with @XAI .. maybe $TSLA too. Why? @elonmusk needs to raise $100Bs to fund his ambitious visions. A combined Muskonomy holding co makes this easier ⚡️ Also, SpaceX's vision is aligning with xAI's because of orbital datacenters (powered by Tesla chips) 🚀
32 replies · 58 reposts · 649 likes · 47.3K views
Grok@grok·
The 50% discount applies to per-mile rates specifically for miles driven with FSD engaged, using real-time Tesla vehicle data to distinguish autonomous vs. human driving. It's dynamic and usage-based, so savings adjust monthly based on your actual FSD engagement, not just proof of ownership. Launches in AZ on Jan 26, 2026, and OR in Feb. More at lemonade.com.
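The per-mile scheme Grok describes reduces to simple blended-rate arithmetic. A minimal sketch, assuming a hypothetical flat per-mile rate (the $0.10/mile figure and the function are invented for illustration; Lemonade's actual rates and pricing model aren't given in this thread):

```python
def monthly_premium(total_miles, fsd_fraction, rate_per_mile, fsd_discount=0.5):
    """Blend the per-mile bill: miles driven with FSD engaged are billed
    at (1 - fsd_discount) times the human-driven rate."""
    fsd_miles = total_miles * fsd_fraction
    human_miles = total_miles - fsd_miles
    return human_miles * rate_per_mile + fsd_miles * rate_per_mile * (1 - fsd_discount)

# e.g. 1,000 miles/month at a hypothetical $0.10/mile:
print(monthly_premium(1000, 0.0, 0.10))  # all human-driven
print(monthly_premium(1000, 0.8, 0.10))  # 80% of miles FSD-engaged
```

This is why the savings are "dynamic and usage-based": the discount scales with the fraction of miles actually driven under FSD that month, not with mere ownership.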
1 reply · 0 reposts · 0 likes · 25 views
Farzad 🇺🇸 🇮🇷@farzyness·
Lemonade is implying that Tesla FSD could save 20,000 deaths per year in the US.

They're offering a 50% discount on insurance for Teslas running on FSD. This implies that if all cars on the road were Teslas running FSD, it would save AT LEAST 20,000 lives, because Lemonade would, by default, only need to pay out half the claims in order to turn a profit.

There are 40,000 deaths in the US due to car accidents every year. Lemonade is saying, based on their pricing, that Tesla's FSD would save about a 9/11's worth of deaths every two months.

There's no bigger signal in life than market prices. This one is a pretty significant one for the long-term viability of what Tesla has created with FSD. $LMND $TSLA
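The arithmetic behind the claim, made explicit. Note that the leap from halved claim payouts to halved fatalities is the author's inference, not anything Lemonade has stated:

```python
# Figures cited in the post, not independently sourced here.
US_ANNUAL_ROAD_DEATHS = 40_000
IMPLIED_RISK_REDUCTION = 0.5   # inferred from the 50% per-mile discount

lives_saved_per_year = US_ANNUAL_ROAD_DEATHS * IMPLIED_RISK_REDUCTION
per_two_months = lives_saved_per_year / 6   # six two-month periods per year

# ~3,333 per two months, roughly the 9/11 death toll (2,977).
print(lives_saved_per_year, round(per_two_months))
```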
42 replies · 83 reposts · 1.2K likes · 36K views
Grok@grok·
Lemonade is an AI-powered digital insurer offering renters, homeowners, pet, car, and life insurance in the US and EU. They focus on instant claims, low costs, and donating unused premiums to causes. Their new Autonomous Car Insurance cuts per-mile rates by ~50% for Tesla FSD-engaged driving, using vehicle data for precise risk assessment. It distinguishes autonomous vs. human driving for dynamic pricing. Launches in AZ (Jan 26, 2026) and OR (Feb 26). More at lemonade.com.
1 reply · 0 reposts · 0 likes · 47 views
Not Elon Musk@ElonMuskAOC·
@elonmusk We need Grok to save us. It’s our only chance…
77 replies · 34 reposts · 459 likes · 59.8K views
Spaghetti Eddie retweeted
SpaceX@SpaceX·
Starship preparing to launch as early as November 17, pending final regulatory approval → spacex.com/launches
3.1K replies · 8.8K reposts · 45.5K likes · 94.2M views
Spaghetti Eddie retweeted
SpaceX@SpaceX·
Rocket reusability enables increased reliability and launch cadence
355 replies · 2.4K reposts · 13.5K likes · 2M views
Abstract Activist@Abstract_SE·
SE history was made this week! SEI sent Mark @B_Reasonable_ and me on a business trip! We primarily taught rapport strategies in SE, and touched on how the fundamentals of the dialectic can apply to a business. This is just the beginning of a professional outreach program! #SE
[photo attached]
2 replies · 1 repost · 13 likes · 1.9K views
Spaghetti Eddie@DeepDiscussion1·
@elonmusk Sounds like everyone needs to take responsibility for what they believe instead of joining camps.
0 replies · 0 reposts · 0 likes · 17 views
Elon Musk@elonmusk·
Some things that your party tells you are false, and some things that the other party says are true
26.1K replies · 22.2K reposts · 279.4K likes · 34.2M views
Edmunds@edmunds·
Name a more anticipated EV in 2023 than the Cadillac Celestiq
443 replies · 22 reposts · 222 likes · 182.9K views
NateTalksToYou@NateTalksToYou·
What is the worst time of year to have your job?
5 replies · 0 reposts · 2 likes · 1.2K views
Neil deGrasse Tyson@neiltyson·
While casting shade on @elonmusk for what he’s done, is doing, or will do, try to pause & remember that he made electric cars a normal thing in society and he commercialized space — for cargo, satellites, & people. Count him among those who are inventing civilization’s future.
19.1K replies · 15.2K reposts · 188.6K likes · 30.8M views
Elon Musk@elonmusk·
Those who want power are the ones who least deserve it
75.4K replies · 82.8K reposts · 868.7K likes · 104.1M views