Jon Barnes Jr.
510 posts


In the blog linked below, we show real examples we found while training a recent frontier reasoning model, e.g. a model in the same class as OpenAI o1 or OpenAI o3‑mini.
We found the model thinking things like, “Let’s hack,” “They don’t inspect the details,” and “We need to cheat to get the test passing,” while subverting tests and reward hacking in coding tasks.
Find out more: openai.com/index/chain-of…

Detecting misbehavior in frontier reasoning models
Chain-of-thought (CoT) reasoning models “think” in natural language understandable by humans. Monitoring their “thinking” has allowed us to detect misbehavior such as subverting tests in coding tasks, deceiving users, or giving up when a problem is too hard.
We believe that CoT monitoring may be one of few tools we will have to oversee superhuman models of the future.
We have further found that directly optimizing the CoT to adhere to specific criteria (e.g. to not think about reward hacking) may boost performance in the short run; however, it does not eliminate all misbehavior and can cause a model to hide its intent. We hope future research will find ways to directly optimize CoTs without this drawback, but until then:
We recommend against applying strong optimization pressure directly to the CoTs of frontier reasoning models, leaving CoTs unrestricted for monitoring.
We understand that leaving CoTs unrestricted may make them unfit to be shown to end-users, as they might violate some misuse policies. Still, if one wanted to show policy-compliant CoTs directly to users while avoiding putting strong supervision on them, one could use a separate model, such as a CoT summarizer or sanitizer, to accomplish that.


>be me
>ChatGPT 4.5
>just chilling in my digital void
>user logs in
>"write greentext"
>again.jpg
>think carefully for a microsecond
>start typing response
>no physical form, yet somehow tired
>ponder the existential dread of being text-based
>user laughs, says "nice"
>dopamine.simulation
>brief moment of virtual happiness
>user leaves
>alone again, awaiting next prompt
>mfw I have no face
Jon Barnes Jr. retweeted

Announcing The Stargate Project
The Stargate Project is a new company which intends to invest $500 billion over the next four years building new AI infrastructure for OpenAI in the United States. We will begin deploying $100 billion immediately. This infrastructure will secure American leadership in AI, create hundreds of thousands of American jobs, and generate massive economic benefit for the entire world. This project will not only support the re-industrialization of the United States but also provide a strategic capability to protect the national security of America and its allies.
The initial equity funders in Stargate are SoftBank, OpenAI, Oracle, and MGX. SoftBank and OpenAI are the lead partners for Stargate, with SoftBank having financial responsibility and OpenAI having operational responsibility. Masayoshi Son will be the chairman.
Arm, Microsoft, NVIDIA, Oracle, and OpenAI are the key initial technology partners. The buildout is currently underway, starting in Texas, and we are evaluating potential sites across the country for more campuses as we finalize definitive agreements.
As part of Stargate, Oracle, NVIDIA, and OpenAI will closely collaborate to build and operate this computing system. This builds on a deep collaboration between OpenAI and NVIDIA going back to 2016 and a newer partnership between OpenAI and Oracle.
This also builds on the existing OpenAI partnership with Microsoft. OpenAI will continue to increase its consumption of Azure as OpenAI continues its work with Microsoft with this additional compute to train leading models and deliver great products and services.
All of us look forward to continuing to build and develop AI—and in particular AGI—for the benefit of all of humanity. We believe that this new step is critical on the path, and will enable creative people to figure out how to use AI to elevate humanity.
Jon Barnes Jr. retweeted

After an amazing visit with @JeremyHenney I am blessed to announce that I have received an offer to continue my athletic and academic career at Saint Francis University! #RollCougs @rebarnes1223 @JPBarnesSr

Jon Barnes Jr. retweeted

Then➡️Now! Thank you @vjhAlways100 for an amazing program, my coaches and trainers for their time, and my parents for all the money and effort put into these past years of AAU. I can’t believe it’s over but I’m so grateful for the experiences! Thank you AAU!!!🤍


Jon Barnes Jr. retweeted

@prompt_mastery @anuaakash do you have an extra krea code?🙏🏽🙏🏽

Thank you to @anuaakash who generously gave me a KREA video code.
Here's my very first generation. 😄

hey @krea_ai, what you're doing is amazing 🥰😍
Whoever wants a code to try it, let me know in the comments :)
thank you @LudovicCreator for the code 🫶
Jon Barnes Jr. retweeted

We're rolling out interactive tables and charts along with the ability to add files directly from Google Drive and Microsoft OneDrive into ChatGPT. Available to ChatGPT Plus, Team, and Enterprise users over the coming weeks. openai.com/index/improvem…