Henri Bonamy

38 posts

@henribonamy

Building @hackiterate and AI research @centralesupelec

San Francisco, CA Joined April 2024
46 Following 95 Followers
Pinned Tweet
Henri Bonamy
Henri Bonamy@henribonamy·
ml-intern: beats claude code + fully integrated with hf to train directly from your terminal. Check it out ! Happy to have helped with it :) 🤗
Aksel@akseljoonas

Introducing ml-intern, the agent that just automated the post-training team @huggingface

It's an open-source implementation of the real research loop that our ML researchers do every day. You give it a prompt, it researches papers, goes through citations, implements ideas in GPU sandboxes, iterates and builds deeply research-backed models for any use case. All built on the Hugging Face ecosystem.

It can pull off crazy things:

We made it train the best model for scientific reasoning. It went through citations from the official benchmark paper. Found OpenScience and NemoTron-CrossThink, added 7 difficulty-filtered dataset variants from ARC/SciQ/MMLU, and ran 12 SFT runs on Qwen3-1.7B. This pushed the score 10% → 32% on GPQA in under 10h. Claude Code's best: 22.99%.

In healthcare settings it inspected available datasets, concluded they were too low quality, and wrote a script to generate 1100 synthetic data points from scratch for emergencies, hedging, multilingual etc. Then upsampled 50x for training. Beat Codex on HealthBench by 60%.

For competitive mathematics, it wrote a full GRPO script, launched training with A100 GPUs on hf.co/spaces, watched rewards climb and then collapse, and ran ablations until it succeeded. All fully backed by papers, autonomously.

How it works? ml-intern makes full use of the HF ecosystem:
- finds papers on arxiv and hf.co/papers, reads them fully, walks citation graphs, pulls datasets referenced in methodology sections and on hf.co/datasets
- browses the Hub, reads recent docs, inspects datasets and reformats them before training so it doesn't waste GPU hours on bad data
- launches training jobs on HF Jobs if no local GPUs are available, monitors runs, reads its own eval outputs, diagnoses failures, retrains

ml-intern deeply embodies how researchers work and think. It knows what data should look like and what good models feel like.

Releasing it today as a CLI and a web app you can use from your phone/desktop.
CLI: github.com/huggingface/ml…
Web + mobile: huggingface.co/spaces/smolage…

And the best part? We also provisioned 1k$ GPU resources and Anthropic credits for the quickest among you to use.

English
1
1
6
580
Henri Bonamy
Henri Bonamy@henribonamy·
moreover, the structure can reform itself following a disturbance (a "wound"), similar to how organisms self-repair. still fascinated by seeing it in action (4/5) 🧵
English
1
0
0
2
Henri Bonamy
Henri Bonamy@henribonamy·
Life is weird: complex things can emerge from very basic rules. this full lizard grows pretty much by itself! about a year ago, I was looking for a cool deep learning project. I found an article about neural network cellular automata. the idea is simple: (1/5) 🧵
Henri Bonamy tweet media
English
1
0
2
11
Nate Berkopec
Nate Berkopec@nateberkopec·
I'm so sick of reading em dashes and "it's not x, it's y." I'm so sick of it, man.
English
365
271
4.6K
268.5K
Henri Bonamy
Henri Bonamy@henribonamy·
llms feel like coding superintelligence until you're ever so slightly out of distribution and instantly it's like talking to a toddler
English
1
0
3
56
Henri Bonamy
Henri Bonamy@henribonamy·
@Yuchenj_UW taste is kind of verifiable. people generally agree on whether a website is beautiful or not. probably true for art, however, because its beauty comes from unique ideas
English
1
0
0
69
Yuchen Jin
Yuchen Jin@Yuchenj_UW·
AI will solve coding and math first, because the outputs are verifiable. AI won’t “solve” art, because art has no unit test. There is no single definition of good or bad. And by art, I don’t just mean paintings or music. I mean designing a great product, building a great company, and anything where taste is the moat.
English
155
35
386
26.1K
Vincent Weisser
Vincent Weisser@vincentweisser·
We are open sourcing renderers

For RL, the inference server should be simple
Tokens in, tokens out

renderers is the token-level chat templating layer to
>render messages to tokens
>parse completions to structure
>bridge rollouts byte-for-byte

>3x throughput on open models
Prime Intellect@PrimeIntellect

Introducing Renderers RL trainers work in tokens. Environments work in messages. Going back and forth corrupts sampled tokens, wasting compute on every agentic turn. With Renderers, we fix this mismatch. This unlocks >3x throughput on popular open models.
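
A minimal sketch of the mismatch described above, not the Renderers API. It assumes a hypothetical three-token vocabulary and a greedy longest-match tokenizer, and shows how decoding sampled tokens to text and re-encoding them can yield different token ids than the ones the model actually sampled.

```python
# Hypothetical toy vocabulary: "ab" exists as a merged token next to "a" and "b".
VOCAB = {"a": 0, "b": 1, "ab": 2}
ID_TO_TEXT = {token_id: text for text, token_id in VOCAB.items()}


def decode(token_ids):
    """Turn token ids back into text (the 'messages' side)."""
    return "".join(ID_TO_TEXT[i] for i in token_ids)


def encode(text):
    """Greedy longest-match tokenization (the 'tokens' side)."""
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest possible piece first, then shrink.
        for length in range(len(text) - pos, 0, -1):
            piece = text[pos:pos + length]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                pos += length
                break
        else:
            raise ValueError(f"cannot tokenize {text[pos:]!r}")
    return ids


# Suppose the policy sampled "a" then "b" as two separate tokens.
sampled = [0, 1]

# Going tokens -> text -> tokens merges them into the single "ab" token.
round_tripped = encode(decode(sampled))
tokens_match = sampled == round_tripped

print(sampled, round_tripped, tokens_match)
```

In this toy case the text is identical but the token ids differ, so a trainer that re-tokenizes the message would score tokens the model never sampled.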

English
4
9
120
10K
Aksel
Aksel@akseljoonas·
3 weeks since ml-intern launched and we just hit 1M messages exchanged. that's 3.3 agent-years of ML research in 21 days. 2 months worth of research every day. 17,383 training jobs total. talk about AI acceleration.

here's some of what people built:

@cmpatino_ replicated the full DeepSeek v4 architecture and pre+post trained a 100M MoE from scratch. → huggingface.co/cmpatino/nanow…

it landed a third place submission on @kellerjordan0 optimizer competition. autoresearch on SOTA territory. github.com/KellerJordan/m…

@_lewtun got the intern to convert @AlecRad's cool new talkie-lm 1930 model to work with transformers. tokenizer, chat template, model conversion etc all one-shotted by ml-intern. huggingface.co/lewtun/talkie-…

someone created an entire PhD dissertation chapter on context-aware agentic cyber defense, drafted with 16 research subagents.

and someone used it to crack an @Anthropic kernel optimization take-home. (we don't know how to feel about this one 👀 )

just getting started → huggingface.co/spaces/smolage…
English
19
17
155
34.6K
Henri Bonamy
Henri Bonamy@henribonamy·
Apparently the muon optimizer led 25% of the parameters in a model to die during training (become inactive). @tilderesearch published a blog post about training a model equivalent to qwen3 with orders of magnitude fewer training tokens and 25% fewer parameters
Henri Bonamy tweet media
Tilde@tilderesearch

Introducing Aurora, a new optimizer for training frontier-scale models. We train Aurora-1.1B, which achieves 100x data efficiency on open-source internet data. Despite having 25% fewer parameters, 2 orders of magnitude fewer training tokens, and using fully open-source internet-only data, Aurora matches Qwen3-1.7B on several benchmarks. Aurora was developed after identifying a major failure mode that can occur under Muon, an increasingly popular optimizer that has shown strong gains over Adam(W). We find that Muon can cause a huge percentage of neurons to effectively die early in training, reducing effective network capacity so that many parameters no longer meaningfully contribute to network outputs. By redistributing update energy more uniformly across neurons while preserving Muon’s stability properties, Aurora prevents neuron death and recovers substantial model capacity. What makes this work especially exciting is that it points toward a broader direction for ML research: better optimizers may not come purely from elegant mathematical abstractions, but from understanding and addressing the concrete dynamics and pathologies that emerge inside real training systems.

English
0
0
3
165
Henri Bonamy
Henri Bonamy@henribonamy·
@willccbb very interesting article, read it this morning. crazy nice visualizations
English
0
0
0
120
Carlos Miguel Patiño
Carlos Miguel Patiño@cmpatino_·
Introducing nanowhale 🐳! A tiny DeepSeek model fully pretrained by an agent. Inspired by @karpathy's nanochat, we gave ml-intern the task of training a tiny MoE with all the architectural advancements of DeepSeek v4. To test it end-to-end, it trained a 100M-parameter MoE through both pretraining and post-training.
English
39
101
996
108.5K
Hamzé 🦀
Hamzé 🦀@Hamzeml·
Python made AI accessible. Rust can make parts of AI understandable. That’s the bet behind Category Theory for Tiny ML in Rust.

We’re building tiny ML systems from first principles using:
- Rust types
- typed transformations
- composition
- training loops
- category theory as an engineering tool

Not abstraction cosplay. Executable structure. Working draft. Public feedback welcome.
Hamzé 🦀 tweet media
English
58
326
2.4K
134.4K
Henri Bonamy
Henri Bonamy@henribonamy·
@yitong the trend barely started and I've seen enough already
English
0
0
0
1.5K
Ahmed Hassan
Ahmed Hassan@uihssn·
Addicted to ASCII backgrounds.
Ahmed Hassan tweet media
English
12
5
183
6.6K
Henri Bonamy
Henri Bonamy@henribonamy·
@parisbayarea you're absolutely right — the question isn't whether you're technical or not but how you leverage the modern AI tools 🚀
English
0
0
0
26
parisbayarea
parisbayarea@parisbayarea·
I’m techniclaude
English
2
0
7
469