Chess Stetson
63 posts

Chess Stetson retweeted

In the era of pretraining, what mattered was internet text. You'd primarily want a large, diverse, high-quality collection of internet documents to learn from.
In the era of supervised finetuning, it was conversations. Contract workers are hired to create answers for questions, a bit like what you'd see on Stack Overflow, Quora, etc., but geared towards LLM use cases.
Neither of the two above is going away (imo), but in this era of reinforcement learning, it is now environments. Unlike the above, they give the LLM an opportunity to actually interact: take actions, see outcomes, etc. This means you can hope to do a lot better than statistical expert imitation. And they can be used both for model training and evaluation. But just like before, the core problem now is needing a large, diverse, high-quality set of environments, as exercises for the LLM to practice against.
In some ways, I'm reminded of OpenAI's very first project (gym), which was exactly a framework hoping to build a large collection of environments in the same schema, but this was way before LLMs. So the environments were simple academic control tasks of the time, like cartpole, ATARI, etc. The @PrimeIntellect environments hub (and the `verifiers` repo on GitHub) builds the modernized version specifically targeting LLMs, and it's a great effort/idea. I pitched that someone build something like it earlier this year:
x.com/karpathy/statu…
Environments have the property that once the skeleton of the framework is in place, in principle the community / industry can parallelize across many different domains, which is exciting.
Final thought - personally and long-term, I am bullish on environments and agentic interactions but I am bearish on reinforcement learning specifically. I think that reward functions are super sus, and I think humans don't use RL to learn (maybe they do for some motor tasks etc, but not intellectual problem solving tasks). Humans use different learning paradigms that are significantly more powerful and sample efficient and that haven't been properly invented and scaled yet, though early sketches and ideas exist (as just one example, the idea of "system prompt learning", moving the update to tokens/contexts not weights and optionally distilling to weights as a separate process a bit like sleep does).
Prime Intellect @PrimeIntellect
Introducing the Environments Hub. RL environments are the key bottleneck to the next wave of AI progress, but big labs are locking them down. We built a community platform for crowdsourcing open environments, so anyone can contribute to open-source AGI.
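
For readers new to the idea, here is a minimal sketch of what "an environment" means in the retweeted post: something the model can act on, get an outcome from, and be scored by. Everything in it is hypothetical and made up for illustration (`ArithmeticEnv`, `StepResult`, `run_episode`, `calculator_policy`); it is not the API of gym, the `verifiers` repo, or the Environments Hub, and the policy is a plain function standing in for an LLM.

```python
from dataclasses import dataclass
import random


@dataclass
class StepResult:
    observation: str  # text shown to the policy on the next turn
    reward: float     # scalar score for the last action
    done: bool        # True once the episode is over


class ArithmeticEnv:
    """Hypothetical toy environment: ask for a sum, grade the reply."""

    def __init__(self, seed: int = 0, max_attempts: int = 3):
        self._rng = random.Random(seed)
        self._max_attempts = max_attempts
        self._answer = 0
        self._attempts = 0

    def reset(self) -> str:
        """Start a new episode and return the first observation."""
        a, b = self._rng.randint(1, 99), self._rng.randint(1, 99)
        self._answer = a + b
        self._attempts = 0
        return f"What is {a} + {b}?"

    def step(self, action: str) -> StepResult:
        """Grade one action; the episode ends on success or after max_attempts."""
        self._attempts += 1
        if action.strip() == str(self._answer):
            return StepResult("Correct.", 1.0, True)
        out_of_tries = self._attempts >= self._max_attempts
        return StepResult("Incorrect, try again.", 0.0, out_of_tries)


def run_episode(env, policy) -> float:
    """Act, observe, repeat until done; return the total reward."""
    observation = env.reset()
    total = 0.0
    done = False
    while not done:
        action = policy(observation)
        result = env.step(action)
        total += result.reward
        observation = result.observation
        done = result.done
    return total


def calculator_policy(observation: str) -> str:
    """Stand-in for an LLM: adds whatever integers appear in the prompt."""
    tokens = observation.rstrip("?").split()
    return str(sum(int(t) for t in tokens if t.isdigit()))
```

Calling `run_episode(ArithmeticEnv(seed=0), calculator_policy)` runs one episode and returns its total reward. The same loop serves for training (feed the reward to an optimizer) or evaluation (average it over many episodes), which is the dual use the post mentions. A shared `reset`/`step` shape is also what would let many people write environments for different domains in parallel.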

@thefolake I couldn't believe your sound at the @GEANCOFDN gala last night! Like King Sunny Ade mixed with Animals as Leaders. You shredded.


@shaunmmaguire Tell us if it manages to walk around spilled drinks

@chamath The US definitely needs to focus more on scientific achievement, but science is a friendly competition. I'm at CVPR right now and seeing the amazing things Chinese (among other) researchers are doing pumps me up.

These meetups have gotten really active! Hope to see all our AI colleagues (Big AI Meetup, AI LA and everyone else) again next Monday for the monthly AI meetup at King's Row. @dRISK_ai
tinyurl.com/3v8udf7x
meetup.com/pasadena-big-d… #Meetup via @Meetup


@wdavidmarx @TheAtlantic What you write about cultural arbitrage is, I think, what people used to just call "trade," albeit just in cool stuff in your case. But you may be hinting at something bigger...that all trade could just be in information.

I wrote a piece for @TheAtlantic about how, just like we've seen with financial arbitrage, the internet makes it a lot harder to engage in "cultural arbitrage" and why that may be contributing to the feeling of cultural stasis
theatlantic.com/culture/archiv…

Big "Big AI" meetup today, at King's Row in Old Town. Come chat about techniques for deploying AI on real business data, and socialize with practitioners. Can't wait to see ya.
meetup.com/pasadena-big-d…

I'm very proud of my wife Jennifer Stetson & my bro @TuPaco_Farias on the release of their film The Long Game, with Jay Hernandez, an amazing cast, and even Dennis Quaid. youtube.com/watch?v=1-MT3y…


@nithyavraman Nice to have a roof over one's head; too bad not everyone does

@nithyavraman @latimes Big fan of what you're doing. IMO a focus on fixing homelessness in LA will also help us improve everything from housing affordability to mental health. You've got my vote.

Thank you to the @latimes for such a strong, moving endorsement of my re-election campaign. What an unbelievable honor it is to have earned the endorsement of the paper I start my day with.
L.A. Times Opinion @latimesopinion
Endorsement: The Times' recommendation for Los Angeles City Council District 4 (via @latimesopinion) latimes.com/opinion/story/…








