cloud
@cloud11665

Robotics + Sora @openai
SF · Joined July 2017
2.1K Following · 15.3K Followers · 4.1K posts

Pinned Tweet
cloud
cloud@cloud11665·
Life update: moved to the USA on an O-1 visa 🇺🇸 I’m happy to announce I’ll be joining @OpenAI to help build the American Empire. I’ll be working on pioneering the future of world models with Sora through infrastructure development and optimization. While this country has a constantly evolving regulatory/immigration environment, I’m grateful that the doors are still open for high-skilled labor. I am immensely grateful to @rauchg, @drewhouston, @stephenbalaban, @evanjconrad, @model_mechanic and @gabriel1 for backing me.
tender
tender@tenderizzation·
>linear memory overhead with model depth
>let’s solve this by introducing dynamism over which depth entries are needed
>top-k on depth entries
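The tweet is poking fun at a memory-saving scheme: rather than storing an entry per layer (linear in depth), score the depth entries and keep only the top-k. As a generic illustration of that selection step (all names here are mine, not from any specific paper):

```python
# Hypothetical sketch: score each depth entry, keep only the top-k.
# The scores and the meaning of "entry" are placeholders for whatever
# the mocked scheme actually uses.
import heapq

def topk_depth_entries(scores, k):
    """Return the indices of the k highest-scoring depth entries, sorted."""
    return sorted(heapq.nlargest(k, range(len(scores)), key=scores.__getitem__))

scores = [0.1, 0.9, 0.3, 0.7, 0.2]   # one relevance score per layer
print(topk_depth_entries(scores, 2))  # -> [1, 3]
```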
cloud retweeted
LaurieWired
LaurieWired@lauriewired·
I bet you’ve never seen this connector. This is a HyperTransport (HTX) slot. Arguably ahead of its time, it had some amazing latency advantages over PCI-E. Cray (and the broader HPC world) did some really interesting things with it.
adi
adi@adonis_singh·
massive thanks to @theo for giving me the creds to make this run happen! the total run cost ~$210, took ~11 hours, and cost per question was ~$2.2. Not great price to perf but we don’t really care about that when using the pro models anyways lol (models with * are no-reasoning variants, there might be mistakes with this though, didn’t look too much into it)
Air Katakana
Air Katakana@airkatakana·
why does codex have 3 "latest" models? why is there a gpt-5.4 but no gpt-5.4 codex? am i supposed to use regular gpt-5.4 in codex? is gpt-5.4 actually gpt-5.4-codex but they just named it differently? it says it's an agentic coding model please help @OpenAIDevs
prinz
prinz@deredleritt3r·
@cloud11665 @nitbean Embrace the chaos instead of rejecting it. I am fully prepared for AGI to be named GPT-7.1-Pro-Codex-Max (xhigh)
prinz
prinz@deredleritt3r·
They're benchmaxxxing on prinzbench
prinz
prinz@deredleritt3r·
@nitbean Insane model naming schemes are a core component of OpenAI culture
cloud retweeted
Sam Altman
Sam Altman@sama·
GPT-5.4 is launching, available now in the API and Codex and rolling out over the course of the day in ChatGPT. It's much better at knowledge work and web search, and it has native computer use capabilities. You can steer it mid-response, and it supports 1m tokens of context.
cloud
cloud@cloud11665·
@JasonBotterill No magic, just hard work. We have the best inference team in the world
cloud retweeted
Daniel Lemire
Daniel Lemire@lemire·
Years ago, we wrote a C++ library which implements float parsing (std::from_chars). That is, you go from the string "3.1416" to a number (e.g., of type 'double'). I started this work after realizing that in many cases, float parsing was the bottleneck when parsing number-heavy JSON documents (e.g., geojson files). Our code uses an algorithm that is 4 times faster than old float parsing functions in important cases. There are ports in Java, C#, Rust...

BUT up until now, we did not have a straight C implementation (to my knowledge). This is annoying for a project like Redis, which uses our C++ code and therefore needs a C++ compiler. @antirez initiated a C port with the help of AI for this reason. But I think we have something better. Koleman Nix did a full (hand-coded) port to C. I threw exhaustive tests at it and it passes! It is also incredibly fast. And it is just plain C. It is not officially released yet, but you can check it out at github.com/kolemannix/ffc…

In some benchmarks, the new C port is the fastest float parser!!! 100 million floats per second, corresponding to 2 GB/s!!! That's not as fast as your fast disk, but it is getting there!
ptr noalias nonnull %koleman@kolemannix

I'm working to get my library, `ffc`, ready for an initial release. As far as I know it is the fastest string-to-float parser in the world. It's a direct port of @lemire's wonderful fast_float library, but in pure C99 instead of C++. Check it out if you need to parse floats or just like C! github.com/kolemannix/ffc…

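fast_float and the ffc port are C++/C, and their internals aren't shown here, but the classic speedup they build on is well known (Clinger, 1990): when the decimal significand fits exactly in a double and the power of ten is also exact, one rounded multiply or divide produces the correctly rounded result, with no arbitrary-precision fallback. A minimal Python sketch of that fast path, assuming a plain `digits.digits` input (not the libraries' actual code):

```python
# Fast-path float parsing sketch (Clinger): if the significand fits in
# 53 bits and |exp10| <= 22, both operands below are exact doubles, so a
# single correctly rounded multiply/divide gives the right answer.
# Real parsers fall back to a slow path otherwise; we just return None.

def fast_path_parse(s: str):
    mantissa, _, frac = s.partition(".")
    digits = mantissa + frac
    exp10 = -len(frac)           # implicit decimal exponent
    sig = int(digits)
    if sig >= 1 << 53 or not (-22 <= exp10 <= 22):
        return None              # would need the slow, exact path
    if exp10 >= 0:
        return float(sig) * float(10 ** exp10)   # both factors exact
    return float(sig) / float(10 ** -exp10)      # one rounded divide

print(fast_path_parse("3.1416") == float("3.1416"))  # -> True
```

Powers of ten up to 10^22 are exactly representable as doubles (10^22 = 5^22 · 2^22, and 5^22 fits in 53 bits), which is why the |exp10| ≤ 22 bound appears.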
cloud
cloud@cloud11665·
@sahuang97 What exactly was the regression?
sahuang
sahuang@sahuang97·
@cloud11665 i'm just hoping there's no regression like the one i saw on 5.3-codex in some tasks compared with 5.2🙏
cloud
cloud@cloud11665·
@Lazin @prajdabre I have another project where I maxed it out using io_uring and O_DIRECT
Raj Dabre
Raj Dabre@prajdabre·
Technical interview question: Suppose you have 5 TB worth of text data and you want to count the total number of words, how will you do this?
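cloud's answer to the interview question above is about saturating the drives (io_uring + O_DIRECT, in C); the I/O layer aside, the correctness subtlety in any chunked word count is a word straddling a chunk boundary. A minimal Python sketch of the streaming inner loop, independent of how the bytes arrive:

```python
# Streaming word counter: read fixed-size chunks and count
# whitespace -> non-whitespace transitions, carrying the "inside a word"
# flag across chunks so boundary-straddling words aren't double-counted.
# The 5 TB version would shard this loop across files/drives; the state
# machine per shard is the same.
import io

def count_words(stream, chunk_size=1 << 20):
    count, in_word = 0, False
    while chunk := stream.read(chunk_size):
        for ch in chunk:
            if ch.isspace():
                in_word = False
            elif not in_word:
                in_word = True
                count += 1
    return count

text = "hello world  foo\nbar"
print(count_words(io.StringIO(text), chunk_size=4))  # -> 4
```

The tiny `chunk_size=4` in the demo deliberately splits "world" across chunks to show the carried flag doing its job.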
cloud
cloud@cloud11665·
@Lazin @prajdabre These numbers were taken on my 4x 4TiB gen4 nvme raid0 cluster + nowadays a single gen6 drive can do like 30GiB/s
cloud retweeted
karminski-牙医
karminski-牙医@karminski3·
Apple's ANE has been successfully reverse-engineered! Is the 38 TOPS figure actually a numbers game? Just came across a hardcore open-source project by maderix: reverse-engineering Apple's private APIs to bypass CoreML and run neural-network training directly on the Apple Neural Engine (ANE)!

Wait, what's the ANE? It's the neural-network accelerator unit inside Apple silicon; on the M4 it's up to a 16-core unit, officially rated at 38 TOPS. But it has always been a black box: you can only reach it through the CoreML framework, with no public interface, no docs, no ISA, nothing.

So this guy peeled off the CoreML shell. Using reverse-engineering techniques (dyld_info scanning, method swizzling to intercept CoreML, etc.), he recovered the full compile-and-run pipeline. Most importantly, he got the in-memory compilation path working: MIL (roughly analogous to NVIDIA's PTX) can be compiled to ANE binaries directly in memory, which makes training large models on the ANE practical.

The reverse-engineering turned up several bombshells:

First, the ANE is fundamentally a convolution engine, not a matrix-multiply engine. Rewriting the same computation as a convolution triples throughput! (Apple's own ml-ane-transformers reference implementation hints at this pattern but never states it outright.)

Second, the ANE has roughly 32 MB of internal SRAM (inferred from a performance cliff in matmul scaling tests).

Third, a single operator reaches only about 30% of the ANE's peak, because its 16 cores are pipelined: submit one op and most cores sit idle. Chain 16-64 ops into one compute graph, so different cores process different stages of the graph simultaneously, and utilization climbs to 94%.

Finally, the most explosive finding: "38 TOPS" is a numbers game. The author ran identical operations in FP16 and INT8 and got the same throughput. The conclusion: the ANE dequantizes INT8 back to FP16 before computing, so Apple's "38 TOPS INT8" is just 19 TFLOPS FP16 multiplied by two. The real peak is 19 TFLOPS FP16.

One more detail: the ANE has hardware-level power gating. Idle power really is 0 mW; not low-power standby, but fully powered off with zero leakage. That power management is seriously impressive, and extremely friendly for mobile.

Of course, the main value is educational. The two blog posts are packed with far more information than fits here, so if you're interested, read the originals (inside-the-m4-apple-neural-engine); this is just a teaser:

Project: github.com/maderix/ANE
Blog Part 1 (reverse engineering): maderix.substack.com/p/inside-the-m…
Blog Part 2 (benchmarks): maderix.substack.com/p/inside-the-m…

#ANE #CoreML #AppleSilicon #NPUTraining #KCORES
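The "rewrite matmuls as convolutions" claim in the thread rests on a standard identity, not anything ANE-specific: Y = X · Wᵀ is exactly a 1×1 convolution in which each row of X is one spatial position with C_in channels and each row of W is one 1×1 filter. A pure-Python sketch of that equivalence (illustration only, not ANE code; the 3× speedup is about which hardware path each form takes, not the math):

```python
# Matmul vs. 1x1 convolution: same arithmetic, different framing.

def matmul(X, W):
    # Y[i][j] = sum_k X[i][k] * W[j][k]  (i.e. X @ W.T)
    return [[sum(x * w for x, w in zip(row, kern)) for kern in W] for row in X]

def conv1x1(X, W):
    # Treat each row of X as one pixel of a 1x1 feature map with
    # len(row) input channels; each row of W is one 1x1 filter.
    return [[sum(px[c] * kern[c] for c in range(len(px))) for kern in W]
            for px in X]

X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # 2 positions, 3 channels
W = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]  # 2 filters
print(matmul(X, W) == conv1x1(X, W))      # -> True
```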