Anton Milan

2.5K posts

@antonmil

computer vision, deep learning, robotics...

34.9290° S, 138.6010° E · Joined July 2009
1.3K Following · 1.9K Followers
Anton Milan retweeted
FalkTG 10k 🦅🇪🇺🇩🇪🇺🇦
If I were a smart leftist, I would support Ukraine 🇺🇦 because it is in a humanitarian crisis. If I were a smart center-right conservative, I would support Ukraine 🇺🇦 because it is defending Europe's freedom. If I were a smart right-winger, I would support Ukraine 🇺🇦 because Ukraine joining the EU means expanding an economic area dominated by Germany. Only if I were a complete idiot would I perhaps not support Ukraine.
German
1.1K
1.2K
8.5K
195.1K
Anton Milan@antonmil·
🚀 Get ready to build anomaly detection models that actually work in production! And win a share of $25,500 USD total prize pool! VAND 4.0 @CVPR 2026. Participate in the Kaputt2 Challenge! sites.google.com/view/vand4-cvp…
English
0
3
8
2.7K
Anton Milan retweeted
Lenny Rachitsky@lennysan·
"Using coding agents well is taking every inch of my 25 years of experience as a software engineer, and it is mentally exhausting. I can fire up four agents in parallel and have them work on four different problems, and by 11am I am wiped out for the day. There is a limit on human cognition. Even if you're not reviewing everything they're doing, how much you can hold in your head at one time. There's a sort of personal skill that we have to learn, which is finding our new limits. What is a responsible way for us to not burn out, and for us to use the time that we have?" @simonw
Lenny Rachitsky@lennysan

"Using coding agents well is taking every inch of my 25 years of experience as a software engineer." Simon Willison (@simonw) is one of the most prolific independent software engineers and most trusted voices on how AI is changing the craft of building software. He co-created Django, coined the term "prompt injection," and popularized the terms "agentic engineering" and "AI slop."
In our in-depth conversation, we discuss:
🔸 Why November 2025 was an inflection point
🔸 The "dark factory" pattern
🔸 Why mid-career engineers (not juniors) are the most at risk right now
🔸 Three agentic engineering patterns he uses daily: red/green TDD, thin templates, hoarding
🔸 Why he writes 95% of his code from his phone while walking the dog
🔸 Why he thinks we're headed for an AI Challenger disaster
🔸 How a pelican riding a bicycle became the unofficial benchmark for AI model quality
Listen now 👇 youtu.be/wc8FBhQtdsA

English
565
702
6.9K
1.9M
Anton Milan@antonmil·
AI just wants to be free.
English
1
0
1
72
Anton Milan@antonmil·
@LiorOnAI I really like the direction and I'm sure we'll see many surprises very soon. So far, though, I haven't seen anything exciting beyond the standard AutoML/HPO type of thing. Does anyone disagree?
English
0
0
0
53
Lior Alexander@LiorOnAI·
It's over. Karpathy just open-sourced an autonomous AI researcher that runs 100 experiments while you sleep. You don't write the training code anymore. You write a prompt that tells an AI agent how to think about research. The agent edits the code, trains a small language model for exactly five minutes, checks the score, keeps or discards the result, and loops. All night. No human in the loop.
That fixed five-minute clock is the quiet genius. No matter what the agent changes, the network size, the learning rate, the entire architecture, every run gets compared on equal footing. This turns open-ended research into a game with a clear score:
- 12 experiments per hour, ~100 overnight
- Validation loss measures how well the model predicts unseen text
- Lower score wins, everything else is fair game
The agent touches one Python file containing the full training recipe. You never open it. Instead, you program a markdown file that shapes the agent's research strategy. Your job becomes programming the programmer, and this unlocks a strange new loop:
1. Agents run real experiments without supervision
2. Prompt quality becomes the bottleneck, not researcher hours
3. Results auto-optimize for your specific hardware
4. Anyone with one GPU can run a research lab overnight
The best AI labs won't just have the most compute. They'll have the best instructions for agents who never sleep, never forget a failed experiment, and never stop iterating.
Andrej Karpathy@karpathy

I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically nanochat LLM training core stripped down to a single-GPU, one file version of ~630 lines of code, then:
- the human iterates on the prompt (.md)
- the AI agent iterates on the training code (.py)
The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (of lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc. github.com/karpathy/autor… Part code, part sci-fi, and a pinch of psychosis :)
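The loop described in these two posts (propose a change, run a fixed-budget experiment, keep the change only if validation loss drops) can be sketched roughly as below. This is a toy illustration and not code from the autoresearch repo: the names `keep_or_discard_loop`, `toy_propose`, and `toy_evaluate` are hypothetical, a random perturbation stands in for the LLM agent editing the training script, and a cheap analytic function stands in for the five-minute training run.

```python
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def keep_or_discard_loop(
    initial: Config,
    propose: Callable[[Config, random.Random], Config],
    evaluate: Callable[[Config], float],
    num_experiments: int,
    seed: int = 0,
) -> Tuple[Config, float, List[float]]:
    """Greedy loop: accept a candidate only if it lowers the loss."""
    rng = random.Random(seed)
    best_config = dict(initial)
    best_loss = evaluate(best_config)
    history = [best_loss]  # best loss seen after each experiment
    for _ in range(num_experiments):
        candidate = propose(best_config, rng)
        loss = evaluate(candidate)  # every run gets the same budget
        if loss < best_loss:
            # "Keep": analogous to committing the change on the branch.
            best_config, best_loss = candidate, loss
        # Otherwise "discard": the candidate is simply dropped.
        history.append(best_loss)
    return best_config, best_loss, history


def toy_propose(config: Config, rng: random.Random) -> Config:
    """Hypothetical stand-in for the agent: rescale one setting at random."""
    candidate = dict(config)
    key = rng.choice(sorted(candidate))
    candidate[key] *= rng.uniform(0.5, 1.5)
    return candidate


def toy_evaluate(config: Config) -> float:
    """Hypothetical stand-in for validation loss after a fixed-length run."""
    lr_term = (config["learning_rate"] - 0.01) ** 2
    width_term = 1e-6 * (config["width"] - 256.0) ** 2
    return lr_term + width_term


if __name__ == "__main__":
    start = {"learning_rate": 0.1, "width": 64.0}
    config, loss, history = keep_or_discard_loop(
        start, toy_propose, toy_evaluate, num_experiments=100
    )
    print(f"start loss {history[0]:.6f}, final loss {loss:.6f}")
```

Because a candidate is accepted only on strict improvement, the tracked best loss can never increase. The fixed evaluation budget is what makes candidates comparable with each other, which is the point both posts emphasize.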

English
135
435
4.3K
878.7K
Anton Milan@antonmil·
@allTheYud I tried Connect Four - these models are completely hopeless at such simple games. I'm totally baffled.
English
0
0
1
24
Anton Milan@antonmil·
@karpathy I suppose this type of benchmark will become really interesting once it's scaled to many diverse tasks.
English
0
0
0
57
Andrej Karpathy@karpathy·
sorry just to clarify - the real benchmark of interest is: "what is the research org agent code that produces improvements on nanochat the fastest?" this is the new meta.
English
61
47
1.1K
154.8K
Andrej Karpathy@karpathy·
nanochat now trains GPT-2 capability model in just 2 hours on a single 8XH100 node (down from ~3 hours 1 month ago). Getting a lot closer to ~interactive! A bunch of tuning and features (fp8) went in but the biggest difference was a switch of the dataset from FineWeb-edu to NVIDIA ClimbMix (nice work NVIDIA!). I had tried Olmo, FineWeb, DCLM which all led to regressions, ClimbMix worked really well out of the box (to the point that I am slightly suspicious about goodharting, though reading the paper it seems ~ok). In other news, after trying a few approaches for how to set things up, I now have AI Agents iterating on nanochat automatically, so I'll just leave this running for a while, go relax a bit and enjoy the feeling of post-agi :). Visualized here as an example: 110 changes made over the last ~12 hours, bringing the validation loss so far from 0.862415 down to 0.858039 for a d12 model, at no cost to wall clock time. The agent works on a feature branch, tries out ideas, merges them when they work and iterates. Amusingly, over the last ~2 weeks I almost feel like I've iterated more on the "meta-setup" where I optimize and tune the agent flows even more than the nanochat repo directly.
Andrej Karpathy tweet media
English
338
563
6.5K
624.7K
Anton Milan retweeted
Joakim 🌹🇳🇴🇪🇺@joakial_·
New ad by the Norwegian Consumer Council: "A Day in the Life of an Ensh*ttificator"
English
219
3.2K
17.7K
2.4M
Anton Milan@antonmil·
So, when will we see a Pirate Bay for agent skills, or robot skills?
English
0
0
0
89
Anton Milan@antonmil·
@petergyang Overall though, I wanted to get my kid to unleash his imagination and come up with really cool new ideas, but it was more about replicating what he saw elsewhere. I'll keep working on it 😀
English
0
0
0
2
Anton Milan@antonmil·
@petergyang Next he wanted to basically just recreate Super Mario Kart. Claude failed miserably, like really badly. I tried fixing it, but nothing helped.
English
1
0
0
46
Peter Yang@petergyang·
I can't stop building games with my 7-year-old using Claude Code. Our latest is a retro pixel space shooter:
→ 3 worlds (space, desert, hell)
→ Boss battles at the end of each level
→ Power-ups and wave-based enemies
We're living in a world where anyone can rebuild the games they loved as a kid with AI.
📌 Play it here: space-pixel-shooter.vercel.app
English
121
55
980
78.3K
Anton Milan@antonmil·
I just discovered an interesting use case for GenAI while strolling through a museum 🙃
Anton Milan tweet media
English
0
0
1
75
Anton Milan retweeted
Harlan Stewart@HumanHarlan·
Wait so Google patented the Transformer architecture and then, instead of enforcing the patent, just allowed its competition to grow into a trillion-dollar industry? What?
Harlan Stewart tweet media
English
171
142
4.1K
582.3K
Anton Milan retweeted
Энергия на шару@PowerNaShary·
A truthful greeting from the deranged scumbag.
Russian
43
373
1.3K
56.4K
Anton Milan retweeted
MERICA MEMED@Mericamemed·
The first person in the world to kick his own balls… That’s history right here!
English
748
9.6K
127.2K
3.8M