cloud

4.2K posts


cloud

@cloud11665

robotics inference @openai

SF · Joined July 2017
2.1K Following · 15.5K Followers
cloud @cloud11665
@SonglinYang4 I really enjoy the JS branded anime girl OC
1 reply · 0 reposts · 12 likes · 1.7K views
Tenobrus @tenobrus
it looks like 5.4 actually already appreciated the creature-cluster basically just as much... but not 5.2! the goblins crept in recently
[image]
3 replies · 0 reposts · 90 likes · 5.8K views
Tenobrus @tenobrus
"strongly bullish on goblins"
[image]
41 replies · 35 reposts · 755 likes · 37K views
cloud @cloud11665
Those little post-training goblins won this one…
0 replies · 0 reposts · 14 likes · 839 views
cloud retweeted
rahul @rahulgs
GPT-5.5 is ~39% cheaper than Opus 4.7 across merged PRs bucketed by diff size in Inspect. Despite the higher output token cost, 5.5 is cheaper for input tokens (cache writes are free), more token efficient, and it tokenizes the same text to fewer tokens.
[image]
35 replies · 62 reposts · 1.1K likes · 135.1K views
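A minimal sketch of the kind of per-PR cost accounting the tweet above describes, assuming the usual price-per-million-token billing model. The bucket edges, prices, token counts, and the PRRun helper are invented placeholders for illustration, not rahul's actual Inspect data or real GPT-5.5 / Opus 4.7 pricing.

```python
# Hypothetical sketch: bucket merged PRs by diff size, then compare the mean
# per-PR API cost for two models. All numbers below are made-up placeholders.
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class PRRun:
    diff_lines: int      # size of the merged diff
    input_tokens: int    # prompt/context tokens consumed
    output_tokens: int   # completion tokens produced

def usd_cost(run: PRRun, input_price: float, output_price: float) -> float:
    """Cost in USD for one PR; prices are given per million tokens."""
    return (run.input_tokens * input_price + run.output_tokens * output_price) / 1e6

def bucket(diff_lines: int) -> str:
    """Coarse diff-size buckets (placeholder edges)."""
    if diff_lines < 50:
        return "small"
    if diff_lines < 500:
        return "medium"
    return "large"

def mean_cost_by_bucket(runs, input_price, output_price):
    totals, counts = defaultdict(float), defaultdict(int)
    for run in runs:
        b = bucket(run.diff_lines)
        totals[b] += usd_cost(run, input_price, output_price)
        counts[b] += 1
    return {b: round(totals[b] / counts[b], 2) for b in totals}

# Toy data: the same two merged PRs as solved by two hypothetical models.
runs_model_a = [PRRun(40, 120_000, 8_000), PRRun(300, 400_000, 20_000)]
runs_model_b = [PRRun(40, 180_000, 6_000), PRRun(300, 600_000, 15_000)]
print("model A:", mean_cost_by_bucket(runs_model_a, input_price=1.25, output_price=10.0))
print("model B:", mean_cost_by_bucket(runs_model_b, input_price=3.00, output_price=15.0))
```

The toy numbers illustrate the tweet's point: a model with a higher per-token output price can still come out cheaper per merged PR if it needs fewer tokens to land the same diff.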
cloud @cloud11665
@punished_teno What area/team would you be interested in?
0 replies · 0 reposts · 3 likes · 84 views
cloud @cloud11665
@vikhyatk you need to be thinking in petaflops 😭😭😭
2 replies · 0 reposts · 9 likes · 164 views
vik @vikhyatk
@cloud11665 lots of low hanging fruit, not that hard to beat MLX
1 reply · 0 reposts · 8 likes · 289 views
cloud retweeted
Sam Altman @sama
We tried a new thing with NVIDIA to roll out Codex across a whole company and it was awesome to see it work. Let us know if you'd like to do it at your company!
[image]
481 replies · 423 reposts · 8.2K likes · 1M views
cloud @cloud11665
@thsottiaux We should do that more often in MB0…
0 replies · 0 reposts · 0 likes · 378 views
Tibo @thsottiaux
Stay tuned, we are rebooting our office WiFi.
390 replies · 139 reposts · 4.6K likes · 371.7K views
cloud @cloud11665
5.5
0 replies · 0 reposts · 9 likes · 843 views
ueaj @_ueaj
@tszzl @jxnlco every time an OAI employee copeposts in public they add another flop to the training run, they've trained 3 mythoi already
2 replies · 1 repost · 38 likes · 2.1K views
cloud @cloud11665
@tenderizzation @ezyang If we're forced onto stock FSDP, then the logical answer is the setup with the best interconnect, so the GB200 NVL72; that's the only thing that matters, since we most likely won't be memory bound nor flops bound
1 reply · 0 reposts · 2 likes · 141 views
tender @tenderizzation
@ezyang x86 H100 NVL8 vs. GB200 NVL72 hmm
2 replies · 0 reposts · 10 likes · 609 views
Edward Z. Yang @ezyang
You're choosing between a cluster of H100s and GB200s and need to train Llama 70B from scratch at 4K batch size (yeah, yeah, ancient stuff lol). But your codebase only supports FSDP for scaling. Which cluster will let you scale to more GPUs?
13 replies · 1 repost · 78 likes · 32.6K views
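A rough back-of-envelope sketch of the interconnect argument in cloud's reply above (the thread reads newest-first, so ezyang's question is the last post). Plain FSDP all-gathers the sharded weights in the forward and backward passes and reduce-scatters gradients, so per-GPU traffic per step stays roughly constant as GPUs are added, while per-GPU compute shrinks with the fixed global batch. The bandwidths, sequence length, peak FLOPs, and utilization below are order-of-magnitude assumptions, not measured figures for either cluster.

```python
# Back-of-envelope: compute time vs. FSDP communication time per step per GPU,
# for a 70B model at a fixed 4K-sequence global batch. All constants are
# rough placeholder assumptions for illustration only.
PARAMS = 70e9            # Llama-70B-ish parameter count
BYTES_PER_PARAM = 2      # bf16 weights/grads
GLOBAL_BATCH = 4096      # sequences per step (from the question)
SEQ_LEN = 4096           # assumed sequence length
PEAK_FLOPS = 1e15        # ~1 PFLOP/s bf16 per GPU, order of magnitude
MFU = 0.4                # assumed achievable utilization

def step_times(n_gpus: int, bandwidth_gbs: float):
    """Return (compute_seconds, comm_seconds) per training step for one GPU."""
    tokens_per_gpu = GLOBAL_BATCH * SEQ_LEN / n_gpus
    # ~6 FLOPs per parameter per token for forward + backward
    compute_s = 6 * PARAMS * tokens_per_gpu / (PEAK_FLOPS * MFU)
    # FSDP moves ~3x the full model per GPU per step: all-gather weights in
    # forward, all-gather again in backward, reduce-scatter gradients
    comm_s = 3 * PARAMS * BYTES_PER_PARAM / (bandwidth_gbs * 1e9)
    return compute_s, comm_s

for name, bw in [("H100 NVL8-ish, ~50 GB/s inter-node", 50),
                 ("GB200 NVL72-ish, ~900 GB/s NVLink domain", 900)]:
    for n in (64, 512, 4096):
        c, m = step_times(n, bw)
        bound = "comm-bound" if m > c else "compute-bound"
        print(f"{name}: {n:5d} GPUs  compute {c:7.2f}s  comm {m:6.2f}s  ({bound})")
```

Under these assumptions the per-GPU communication cost is a fixed floor set by interconnect bandwidth, so the setup with the larger high-bandwidth NVLink domain keeps scaling to more GPUs after the narrower-interconnect cluster has gone communication-bound, which is the point of picking the GB200 NVL72 when stuck with stock FSDP.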