Paul retweeted

Speedrunning GPT-2 is now routine thanks to @karpathy.
But can we speedrun GPT-3 175B?
We attempted to match accuracy on a <$10K budget; while we didn't quite reach it, our first results show that quality data, engineering, and native FP4 can get close.
Details in 🧵
