Jason Wolfe
1.4K posts

@w01fe
alignment and the model spec @OpenAI (opinions are my own)





I’m probably going to be hiring at least 1-2 people to join me in future exercises like this. Reach out at david@metr.org if you're a high-integrity, scrappy, creative, security+LLM researcher. For more detail, see METR's Frontier Risk Report, Appendix B: metr.org/blog/2026-05-1…

(4) IMO, any “reasonable” civilization would clearly be taking things much more slowly and carefully with AI. The benefits of getting upsides of advanced AI a little faster are small compared to the risks of getting it irrecoverably wrong, and we could lower these risks by going slower

Could an AI company lose control of its own agents? To find out, Anthropic, Google, Meta, and OpenAI let us (1) test their best internal models with CoT access, (2) review non-public info about capabilities, alignment, and control. The result: our first Frontier Risk Report.




OpenAI would support the creation of a global governance body for artificial intelligence led by the U.S. and including China as a member, a top company executive said, hours before the start of President... claimsjournal.com/news/national/…

OpenAI is endorsing both KOSA (!) and Illinois' SB315 today, a frontier AI bill that mirrors the NY and Cali approaches OpenAI previously endorsed. In: state consistency, out: praying hopelessly for a federal standard






We've published a short summary of our monitoring research agenda: apolloresearch.ai/products/a-sca…
1. Build better evaluation datasets for monitoring
2. Automated red-teaming
3. Adversarial training at large scale
We're hiring for applied control researchers: jobs.lever.co/apolloresearch…



@AlisonSomin find yourself a girl who can name at least three court decisions that she 1) hates and 2) thinks were rightly decided as a matter of law. this is a serious recommendation







