Mark Chen

416 posts

@markchen90

Chief Research Officer at @OpenAI. Coach for the USA IOI Team.

Joined June 2020
350 Following · 70.5K Followers
Pinned Tweet
Mark Chen@markchen90·
We wrapped up this year's competition circuit with a full score on the ICPC, after achieving 6th in the IOI, a gold medal at the IMO, and 2nd in the AtCoder Heuristic contest!
Mostafa Rohaninejad@MostafaRohani

1/n I’m really excited to share that our @OpenAI reasoning system got a perfect score of 12/12 during the 2025 ICPC World Finals, the premier collegiate programming competition where top university teams from around the world solve complex algorithmic problems. This would have placed it first among all human participants. 🥇🥇

Mark Chen retweeted
Michelle Pokrass@michpokrass·
we shipped a new version of 5.3 instant to chatgpt yesterday. 5.3 was unintentionally pretty annoyingly clickbait-y. it's better in yesterday's model and we're going to keep stamping that behavior out. keep the feedback coming! help.openai.com/en/articles/68…
Mark Chen@markchen90·
Insane how leaky OpenAI is smh
Tibo@thsottiaux

@SIGKITTEN How about we put codex into ChatGPT and then ChatGPT into the Codex that is within ChatGPT

Mark Chen retweeted
Tibo@thsottiaux·
@0thernet Just use Codex. That might have been a single prompt and worked within your $20 sub
Mark Chen@markchen90·
If you give GPT-5.4 a raw dump of the GPT-2 weights and ask for a <5000 byte C program to inference it, GPT-5.4 succeeds in under 15 minutes! I remember working on a similar exercise to compare results against a proprietary model in a previous paper - it took days!
Hanson Wang@hansonwng

x.com/i/article/2029…

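The task above amounts to reimplementing GPT-2's forward pass, whose core is stacked causal self-attention blocks. As a rough illustration only, here is a toy single-head causal attention in Python; this is not OpenAI's code or the model's output, and the function names are mine:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def causal_attention(q, k, v):
    """Single-head causal self-attention over lists of d-dim vectors.

    Position t attends only to positions <= t (the causal mask that
    makes the model autoregressive)."""
    d = len(q[0])
    out = []
    for t in range(len(q)):
        # Scaled dot-product scores against all earlier positions.
        scores = [sum(qi * ki for qi, ki in zip(q[t], k[s])) / math.sqrt(d)
                  for s in range(t + 1)]
        w = softmax(scores)
        # Output is the attention-weighted mix of value vectors.
        out.append([sum(w[s] * v[s][j] for s in range(t + 1))
                    for j in range(d)])
    return out
```

A real GPT-2 implementation adds multiple heads, learned projections, MLPs, layer norm, and tied embeddings on top of this loop; the byte-count challenge is squeezing all of that into compact C.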
Mark Chen retweeted
Aidan McLaughlin@aidan_mclau·
research has really been cooking with gas lately. feels like playing with a great orchestra
Mark Chen retweeted
Daniel Litt@littmath·
Some thoughts on AI and mathematics, inspired by "First Proof."
Daniel Litt tweet media
Mark Chen retweeted
Jakub Pachocki@merettm·
Very excited about the "First Proof" challenge. I believe novel frontier research is perhaps the most important way to evaluate capabilities of the next generation of AI models.

We have run our internal model with limited human supervision on the ten proposed problems. The problems require expertise in their respective domains and are not easy to verify; based on feedback from experts, we believe at least six solutions (2, 4, 5, 6, 9, 10) have a high chance of being correct, and some further ones look promising.

We will only publish the solution attempts after midnight (PT), per the authors' guidance - the sha256 hash of the PDF is d74f090af16fc8a19debf4c1fec11c0975be7d612bd5ae43c24ca939cd272b1a.

This was a side-sprint executed in a week mostly by querying one of the models we're currently training; as such, the methodology we employed leaves a lot to be desired. We didn't provide proof ideas or mathematical suggestions to the model during this evaluation; for some solutions, we asked the model to expand upon some proofs, per expert feedback. We also manually facilitated a back-and-forth between this model and ChatGPT for verification, formatting and style. For some problems, we present the best of a few attempts according to human judgement.

We are looking forward to more controlled evaluations in the next round! 1stproof.org #1stProof
Mark Chen@markchen90·
Just because we focus on research doesn't mean we will pursue *all* research. Jakub and I have our own research tastes as well (I think we're fairly good tastemakers 😅), and we have to importance weight the various paths to AGI.
🍓🍓🍓@iruletheworldmo

@markchen90 @merettm sorry to ask. can you speak to why former employees state their reason for leaving as being unable to do research. there’s a lot of speculation you could directly address.

Mark Chen@markchen90·
How does OpenAI balance long-term research bets with product-forward research fundamentals? I’ve been getting this question a lot lately, usually framed as a suggestion that Jakub (@merettm) and I are pushing an increasingly product-focused agenda. That characterization is simply wrong.

Foundational research has been core to OpenAI from the start, and today we run a research program with hundreds of exploratory projects - much like the ones that led to our reasoning-model breakthrough. The majority of our compute is allocated to foundational research and exploration, not product milestones. Anyone who has spent time with me or Jakub knows we are the last people in the world who would push for the advancement of products over the advancement of research.

We’re in the business of creating an automated scientist, and capabilities that were considered grand challenges just a few years ago (like IMO-level mathematical reasoning) now emerge as normal parts of the research process. We’re also seeing our models accelerate researchers worldwide, helping advance work across biology, mathematics, physics, and even our own research.

Jakub and I put a lot of effort into ensuring that research stays focused on uncovering algorithms that will scale to the compute we’ll have a year from now. We protect mindshare and amplify discourse on exploratory work. We do this while recognizing that we’re also a deployment company - and that deployment gives us access to even larger-scale compute, richer feedback, and more room for exploration. Our researchers are passionate about having their work out in the world, and a special slice of our org is dedicated to making sure our deployments are delightful for end users.

Our goal isn’t to turn research into a quarterly race. It’s to build a durable research engine - one that compounds learning over time and consistently turns long-horizon exploration into real, measurable advances, while ensuring those advances become valuable in the real world. That’s the roadmap we’re executing on. And while there have been ups and downs over the last decade (as you expect with any research program), I think most of our researchers would share my strong optimism today.
Mark Chen retweeted
Leeham@Liam06972452·
Erdős Problem #635 autonomously resolved by GPT-5.2 Pro. The model thought for just 50 mins, outputting a correct proof in LaTeX, which was then formalised in Lean by @HarmonicMath's Aristotle. Big thanks to @AcerFur for cleaning up the Lean. Literature review is ongoing.
Leeham tweet media
Mark Chen retweeted
Mark Chen@markchen90·
Congrats to @bchesky and @Ahmad_Al_Dahle! Excited to see what new Airbnb experiences are unlocked when outstanding design meets excellent ML.
Brian Chesky@bchesky

.@Ahmad_Al_Dahle is joining as Airbnb's new CTO. I’m often asked about our AI strategy. We believe pairing great design with frontier technology will help us improve the way people experience travel. Excited to build!

Mark Chen retweeted
Bartosz Naskręcki@nasqret·
Either OpenAI has a team of leprechauns and top mathematicians working 24/7 on FrontierMath questions, or GPT-5.2 Pro has actually become that good at mathematics. I can hardly find any non-trivial hard problem that the model cannot answer after 1–2 hours of interaction. Singularity is near…
Mark Chen retweeted
Poetiq@poetiq_ai·
We finally had a moment to run our system with GPT-5.2 X-High on ARC-AGI-2! Using the same Poetiq harness as before, we saw results as high as 75% at under $8 / problem using GPT-5.2 X-High on the full PUBLIC-EVAL dataset. This beats the previous SOTA by ~15 percentage points.
Poetiq tweet media
Mark Chen retweeted
Miles Wang@MilesKWang·
If AI could interact and learn from the physical world, could it make more scientific advances? We had GPT-5 optimize molecular cloning protocols in the wet lab. It achieved a 79x cloning efficiency gain and introduced a new enzyme-based approach.
Miles Wang tweet media
OpenAI@OpenAI

We’re also testing our models on real world lab experience. We worked with Red Queen Bio to test models to optimize protocols in the lab. GPT-5 proposed, ran (via a controlled framework), and iterated on experiments — increasing a standard molecular cloning protocol's efficiency by 79x with a variety of techniques, including a new enzyme-based approach. openai.com/index/accelera…

Mark Chen@markchen90·
@GregKamradt Some of the smartest people I know are incredibly data inefficient.
Greg Kamradt@GregKamradt·
If you’re so smart, why do you need so much training data?
Mark Chen@markchen90·
@BenjaminDEKR I agree that anything that feels like an ad needs to be handled with care, and we fell short. We’ve turned off this kind of suggestion while we improve the model’s precision. We’re also looking at better controls so you can dial this down or off if you don’t find it helpful.
Benjamin De Kraker@BenjaminDEKR·
OpenAI's @markchen90 says they are looking into the not-great "shop at Target" frustration. Giving them the benefit of the doubt to improve it. Mark always seems genuine to me. My 2 main points are really just: - This is essentially an ad - Let users opt-out or turn this off
Benjamin De Kraker@BenjaminDEKR·
I'm in ChatGPT (paid Plus subscription), asking about Windows BitLocker and it's F-ing showing me ADS TO SHOP AT TARGET. Yeah, screw this. Lose all your users.
Benjamin De Kraker tweet media