
Nick Levine

@status_effects
training vintage language models



we've done such an objectively terrible job at explaining ourselves at sfcompute, this is the year we fix that







I wanted to play with the Talkie 1930 models, but they weren't packaged in a convenient transformers format, so I had Codex convert them. They can also now be used with the vLLM transformers backend. Here they are, in case they're useful to anyone else: huggingface.co/collections/xl…









Surely a model pre-trained on the web would fare much better? Yes, and no. We also fine-tune their web-retrained model, and observe a modest gain of roughly one point in solve rate on SWE-bench, achieving 5.7% pass@1 compared to 4.5%. Surprisingly little seems to be lost by throwing away the internet.

The "Tulsa Race Massacre" was effectively invented, in the public imagination, in October 2019, when the show Watchmen premiered with a fictionalized and exaggerated depiction of innocent blacks being bombed by whites. The term was practically nonexistent before then.










