Hen Davidov

140 posts


@hendav136

Rhodes Scholar, StatML CDT @UniOfOxford | @Forbes 30U30 2025 | Ex- @AmazonScience , @TechnionLive

Joined March 2022
407 Following · 226 Followers
Pinned Tweet
Hen Davidov @hendav136
Recently accepted to ICML 2026: Knowing when to quit. We refuse to complete failing LLM generations as they unfold, token by token, instead of judging the prompt up front or scanning the final output. The method utilizes a 2-layer probe to estimate expected correctness.
English
4
12
59
14.1K
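The pinned tweet describes scoring a generation token by token with a 2-layer probe and refusing once the estimated correctness looks too low. A minimal sketch of that control flow follows. Everything concrete in it is an assumption made for illustration, not the paper's method: the probe shape (linear, ReLU, linear, sigmoid), the use of per-token hidden states as input, the `threshold` value, and the names `make_probe`, `probe_score`, and `generate_with_refusal`.

```python
import math
import random


def make_probe(hidden_dim, probe_dim, seed=0):
    """Return a randomly initialised 2-layer probe (hypothetical; a real probe would be trained)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0, 0.1) for _ in range(hidden_dim)] for _ in range(probe_dim)]
    b1 = [0.0] * probe_dim
    w2 = [rng.gauss(0, 0.1) for _ in range(probe_dim)]
    b2 = 0.0
    return w1, b1, w2, b2


def probe_score(probe, hidden):
    """Linear -> ReLU -> linear -> sigmoid: an estimate in (0, 1) of eventual correctness."""
    w1, b1, w2, b2 = probe
    act = [max(0.0, sum(w * h for w, h in zip(row, hidden)) + b)
           for row, b in zip(w1, b1)]
    logit = sum(w * a for w, a in zip(w2, act)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


def generate_with_refusal(hidden_states, probe, threshold=0.3):
    """Walk the per-token hidden states and stop at the first score below threshold.

    Returns (tokens_emitted, refused).
    """
    for t, hidden in enumerate(hidden_states):
        if probe_score(probe, hidden) < threshold:
            return t, True
    return len(hidden_states), False


# Hand-set probe so the behaviour is easy to follow: the score is high when
# the first hidden feature is 1 and low when it is 0.
toy_probe = ([[1.0, 0.0]], [0.0], [4.0], -2.0)
states = [[1.0, 0.0], [1.0, 0.0], [0.0, 0.0], [1.0, 0.0]]
print(generate_with_refusal(states, toy_probe))  # (2, True)
```

In this toy run the third hidden state scores below the threshold, so generation stops after two tokens instead of running to the end and being judged afterwards.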
Hen Davidov @hendav136
@JoshPurtell Wait, but if they use a value function (critic) trained on the verified reward of each completed trajectory, what difference does the split into subtraces make? Am I missing a PRM here?
English
0
0
0
76
Hen Davidov @hendav136
@yoavgo A bit lower than the chance of a pardon
Hebrew
0
0
0
248
Hen Davidov @hendav136
Hen, first year PhD, pretraining an LLM with academia compute for the first time. Wish me luck.
English
21
3
301
44.2K
the_king_in_yellow @ThakurAryancha4
@hendav136 Good luck, dude. I am also currently doing patent research in AI, especially edge AI
English
1
0
0
136
Hen Davidov @hendav136
@ThakurAryancha4 Starting with a tiny GPT-3-esque 100M-param model, and building up on the positive signals
English
1
0
1
709
Hen Davidov @hendav136
@noclearnogga Let's start by changing your profile pic from a dude with a swastika on his forehead?
English
2
0
3
354
Noclear Nogga @noclearnogga
@hendav136 bro, I'm studying physics but considering switching to CS, which I love, but I'm worried AI may automate much of what a CS degree teaches. My interests are AI theory, robotics and maybe quantum computing. If you were starting today, what would you do? I'd like to work in deep tech
English
2
0
0
442
Hen Davidov @hendav136
@JerrryKun I have a transformers arch bet in mind, hoping to make a dent in this possibly saturated line of work
English
1
0
5
1.3K
Hen Davidov @hendav136
@gradascetic Less than you'd expect. Compute is kinda siloed off in different labs. I mainly use my lab's GPUs
English
1
0
5
1.9K
edward @gradascetic
@hendav136 How much compute does ox stats have now?
English
1
0
1
2K
Hen Davidov @hendav136
@ZacharyBamberg1 Thank you! Right now I'm just working off of a llama style recipe, iterating on my arch bet
English
0
0
1
45
Zachary Bamberger @ZacharyBamberg1
@hendav136 Happy to compare notes if you’re feeling confident nonetheless 😅 (It helps to pre-train decoder-only models with existing kernels — avoid pre-training encoder-decoders like the plague)
English
1
0
1
80
Weiyang Liu @Besteuler
@hendav136 You can check Appendix C in our paper. The actual cost depends on whether you use 2nd momentum in Pion. Without it, Pion will be more efficient than both Muon and AdamW.
English
1
0
1
59
Weiyang Liu @Besteuler
🚀 We are very excited to introduce Pion, a spectrum-preserving optimizer for LLM training. Pion shows strong training stability in practice.

This is the project that we have been working on since POET. The central idea is to turn POET/POET-X (spherelab.ai/poet; spherelab.ai/poetx) into an easy-to-use optimizer. Instead of additive updates like Adam & Muon, Pion takes a different route by updating each weight matrix via coupled left & right orthogonal transformations, keeping its weight spectrum stable throughout training.

This different update mechanism is directly inspired by the empirical effectiveness of Orthogonal Finetuning (oft.wyliu.com; boft.wyliu.com) and POET/POET-X, with roots tracing back to minimum-energy training methods such as MHE (arxiv.org/abs/1805.09298) and OPT (opt-training.github.io).

Pion's key features:
✨ Competitive on LLM pretraining, SFT & RLVR
✨ μP-compatible by construction
✨ Stably trains ultra-deep LLMs and even normalization-free LLMs, where AdamW & Muon diverge

🌐 Project: spherelab.ai/pion
📜 Paper: arxiv.org/abs/2605.12492
English
5
16
120
14.7K
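The claim in the Pion announcement that left and right orthogonal transformations keep the weight spectrum stable follows from a standard linear-algebra fact, sketched below. The symbols $W$, $R$, $P$ and the form $W' = R\,W\,P$ are notation assumed here for illustration; how Pion actually parameterizes and updates the two orthogonal factors is not stated in the tweet and is left to the paper.

```latex
% Assumed notation: W is a weight matrix with SVD W = U \Sigma V^\top,
% and R, P are orthogonal (R^\top R = I, P^\top P = I).
\begin{align*}
W' &= R\,W\,P \\
   &= R\,U\,\Sigma\,V^\top P \\
   &= (R\,U)\,\Sigma\,(P^\top V)^\top .
\end{align*}
% R U and P^\top V are products of orthogonal matrices, hence orthogonal,
% so the last line is a valid SVD of W' with the same \Sigma:
% the singular values of W' equal those of W.
```

An additive update $W' = W + \Delta$ carries no such guarantee, since the singular values of $W + \Delta$ generally differ from those of $W$.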