will depue @willdepue
8.7K posts
dei ex machina @openai, past: sora 1 & 2, posttraining o3/4o, applied research
san francisco · Joined May 2018
2.3K Following · 57.4K Followers

will depue retweeted
Alex Zhao @cocohearts
first tranche of @runpod credits should be rolling out
2 replies · 1 repost · 42 likes · 3K views
will depue @willdepue
@shawnbuilds @OpenAI it’s really not that hard! try grabbing the train gpt starter script and have it explain to you how it works + suggest an experiment to run. the MLX script should run on your mac and you can configure it to run in about a minute
0 replies · 0 reposts · 16 likes · 732 views
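The starter script itself isn't reproduced in this thread, so as a stand-in for the "train a tiny language model on your laptop in about a minute" idea, here is a zero-dependency toy: a count-based character bigram model with a held-out loss. Everything in it (`train_bigram`, `avg_nll`, the sample text) is illustrative and not part of OpenAI's actual script.

```python
# Toy illustration only: a count-based character bigram language model,
# NOT the "train gpt" starter script referenced in the tweet. It trains
# in well under a minute on any laptop and shows the basic shape of
# fit-on-train, measure-loss-on-held-out.
import math
from collections import defaultdict

def train_bigram(text, alpha=0.5):
    """Count character bigrams; alpha is add-alpha smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1.0
    vocab = sorted(set(text))
    return counts, vocab, alpha

def avg_nll(model, text):
    """Average negative log-likelihood (nats/char) of text under the model."""
    counts, vocab, alpha = model
    total, n = 0.0, 0
    for a, b in zip(text, text[1:]):
        row = counts[a]
        denom = sum(row.values()) + alpha * len(vocab)
        p = (row.get(b, 0.0) + alpha) / denom
        total += -math.log(p)
        n += 1
    return total / n

train_text = "hello world, hello there, hello again" * 50
heldout = "hello world"
model = train_bigram(train_text)
print(f"held-out loss: {avg_nll(model, heldout):.3f} nats/char")
```

Swapping the count table for a small neural net (e.g. in MLX or PyTorch) is the natural next experiment the tweet is pointing at.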
shawn @shawnbuilds
@OpenAI genuine question - how does someone with zero knowledge in this space get started attempting a problem like this? i'd imagine you'd need some serious ai fundamentals
3 replies · 0 reposts · 9 likes · 6.9K views
will depue @willdepue
@industriaalist i’d be surprised if the search space in the limit of L(N) wasn’t equally rich as L(D)? why do you say so?
1 reply · 0 reposts · 12 likes · 2K views
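For context, L(N) and L(D) here refer to the parameter-limited and data-limited terms of a neural scaling law. Assuming both tweets have the standard Chinchilla-style decomposition (Hoffmann et al., 2022) in mind, which is my reading rather than anything stated in the thread:

```latex
% Chinchilla-style loss decomposition:
% N = parameter count, D = training tokens, E = irreducible loss.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
% "Parameter golf" attacks the A/N^alpha term (shrink N at fixed loss);
% a data-efficiency "slowrun" attacks the B/D^beta term (shrink D).
```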
Samip @industriaalist
few thoughts on openai's parameter golf:
- first, you'd be surprised how many researchers at big labs (not just openai) are interested in our slowrun
- i'd expect openai to be already automating parameter golf *entirely* with agents. and i'd also expect agents to be better than humans at this already.
- for slowrun, we've deliberately kept it less gamified. the search space over learning algorithms for data efficiency is much larger than for compute/parameter efficiency. so slowrun is less of a competition and more of an open research effort toward interesting, new learning algorithms
9 replies · 14 reposts · 316 likes · 22.7K views
will depue @willdepue
@artificialguybr i would be surprised if it wasn’t at least somewhat of an AI <> human collaboration!
0 replies · 0 reposts · 5 likes · 719 views
will depue @willdepue
@itsandrewgao ok, will see if we can get more. if you run the 1xh100 baseline just make a PR with the log and the submission and i'll add it to the non record submissions for iteration
1 reply · 1 repost · 42 likes · 4.4K views
andrew gao @itsandrewgao
nooo how am i supposed to parameter golf when there are no 8xh100s help @willdepue
13 replies · 2 reposts · 149 likes · 17.2K views
bilal @bilaltwovec
@typedfemale who remembers openai requests for research 1 and 2
2 replies · 0 reposts · 18 likes · 2K views
will depue @willdepue
Remembering Mike Lanning, who passed today, leader of the greatest Boy Scout troop in America: Troop 233. Mike was an incredible person, leader & mentor to many. He maintains the record for most Eagle Scouts from one Scoutmaster: 1000+ with Troop 223. He will be deeply missed.
2 replies · 1 repost · 87 likes · 20.2K views
will depue @willdepue
@Yuchenj_UW thanks for sharing! i expect the best submissions to look pretty different than the nanogpt speedrun models, given parameter constraints
2 replies · 0 reposts · 38 likes · 6.1K views
Yuchen Jin @Yuchenj_UW
OpenAI just dropped a training challenge: Train a <16MB language model in 10 minutes on 8×H100s and minimize held-out loss on a fixed FineWeb dataset. Basically NanoGPT Speedrun. They’re sponsoring $1M in compute. I can summon my autoresearch army to win it… if I have time.
49 replies · 72 reposts · 1.2K likes · 106.1K views
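The <16MB cap implies a parameter budget that depends on checkpoint precision. A quick back-of-envelope, assuming the cap applies to raw weight bytes with no serialization overhead (my reading, not a stated rule of the challenge):

```python
# Back-of-envelope: how many parameters fit in a 16 MB checkpoint,
# assuming the cap is on raw weight bytes (an assumption, not the
# official rule) and ignoring serialization overhead.
BUDGET_BYTES = 16 * 1024 * 1024

def max_params(bytes_per_param):
    return BUDGET_BYTES // bytes_per_param

for fmt, bpp in [("fp32", 4), ("bf16", 2), ("int8", 1)]:
    print(f"{fmt}: ~{max_params(bpp) / 1e6:.1f}M params")
# fp32: ~4.2M params, bf16: ~8.4M params, int8: ~16.8M params
```

So the submissions live in the single-digit-millions-of-parameters regime, far below the nanogpt speedrun models, which is consistent with will's point above about the best submissions looking quite different.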
will depue @willdepue
@test_tm7873 @Yuchenj_UW feel free to train and test on whatever! we just require final leaderboard submissions to be on h100s. you can always share a github repo if you’re not submitting to the leaderboard (see non record submissions), just make sure it still follows the rules of fixed dataset and eval
1 reply · 0 reposts · 1 like · 83 views
testtm @test_tm7873
@Yuchenj_UW man, it's epic, i love small language models. but all the experience i have with 'em is on tpus. 😭 and they want H100s. nnnnnooooooo.
2 replies · 0 reposts · 21 likes · 4.9K views
Andrej Karpathy @karpathy
The signature is alluding to NVIDIA GTC 2015, where Jensen excitedly told an audience of, at the time, mostly gamers and scientific computing professionals that Deep Learning is The Next Big Thing, citing among other examples my PhD thesis (one of the first image captioning systems that coupled an image recognition ConvNet to an autoregressive RNN language model, trained end to end). This was back when most people were still unaware and somewhat skeptical, but of course - Jensen was 1000% correct, highly prescient and locked in very early.
27 replies · 48 reposts · 1.2K likes · 68.4K views
Andrej Karpathy @karpathy
Thank you Jensen and NVIDIA! She’s a real beauty! I was told I’d be getting a secret gift, with a hint that it requires 20 amps. (So I knew it had to be good). She’ll make for a beautiful, spacious home for my Dobby the House Elf claw, among lots of other tinkering, thank you!!

Quoting NVIDIA AI Developer @NVIDIAAIDev:
🙌 Andrej Karpathy’s lab has received the first DGX Station GB300 -- a Dell Pro Max with GB300. 💚 We can't wait to see what you’ll create @karpathy! 🔗 blogs.nvidia.com/blog/gtc-2026-… @DellTech

495 replies · 777 reposts · 17.8K likes · 878.2K views