Mark Chen

416 posts

@markchen90

Chief Research Officer at @OpenAI. Coach for the USA IOI Team.

Joined June 2020
350 Following · 70.5K Followers
Pinned Tweet
Mark Chen@markchen90·
We wrapped up this year's competition circuit with a full score on the ICPC, after achieving 6th in the IOI, a gold medal at the IMO, and 2nd in the AtCoder Heuristic contest!
Mostafa Rohaninejad@MostafaRohani

1/n I’m really excited to share that our @OpenAI reasoning system got a perfect score of 12/12 during the 2025 ICPC World Finals, the premier collegiate programming competition where top university teams from around the world solve complex algorithmic problems. This would have placed it first among all human participants. 🥇🥇

Mark Chen retweeted
Michelle Pokrass@michpokrass·
we shipped a new version of 5.3 instant to chatgpt yesterday. 5.3 was unintentionally pretty annoyingly clickbait-y. it's better in yesterday's model and we're going to keep stamping that behavior out. keep the feedback coming! help.openai.com/en/articles/68…
Mark Chen@markchen90·
Insane how leaky OpenAI is smh
Tibo@thsottiaux

@SIGKITTEN How about we put codex into ChatGPT and then ChatGPT into the Codex that is within ChatGPT

Mark Chen retweeted
Tibo@thsottiaux·
@0thernet Just use Codex. That might have been a single prompt and worked within your $20 sub
Mark Chen@markchen90·
If you give GPT-5.4 a raw dump of the GPT-2 weights and ask for a <5000 byte C program to inference it, GPT-5.4 succeeds in under 15 minutes! I remember working on a similar exercise to compare results against a proprietary model in a previous paper - it took days!
Hanson Wang@hansonwng

x.com/i/article/2029…

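The task above amounts to reimplementing GPT-2's forward pass, whose core is stacked causal self-attention blocks. As a rough illustration only, here is a toy single-head causal attention in Python; this is not OpenAI's code or the model's output, and the function names are mine:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def causal_attention(q, k, v):
    """Single-head causal self-attention over lists of d-dim vectors.

    Position t attends only to positions <= t (the causal mask that
    makes the model autoregressive)."""
    d = len(q[0])
    out = []
    for t in range(len(q)):
        # Scaled dot-product scores against all earlier positions.
        scores = [sum(qi * ki for qi, ki in zip(q[t], k[s])) / math.sqrt(d)
                  for s in range(t + 1)]
        w = softmax(scores)
        # Output is the attention-weighted mix of value vectors.
        out.append([sum(w[s] * v[s][j] for s in range(t + 1))
                    for j in range(d)])
    return out
```

A real GPT-2 implementation adds multiple heads, learned projections, MLPs, layer norm, and tied embeddings on top of this loop; the byte-count challenge is squeezing all of that into compact C.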
Mark Chen retweeted
Aidan McLaughlin@aidan_mclau·
research has really been cooking with gas lately. feels like playing with a great orchestra
Mark Chen retweeted
Daniel Litt@littmath·
Some thoughts on AI and mathematics, inspired by "First Proof."
Daniel Litt tweet media
Mark Chen retweeted
Jakub Pachocki@merettm·
Very excited about the "First Proof" challenge. I believe novel frontier research is perhaps the most important way to evaluate capabilities of the next generation of AI models.

We have run our internal model with limited human supervision on the ten proposed problems. The problems require expertise in their respective domains and are not easy to verify; based on feedback from experts, we believe at least six solutions (2, 4, 5, 6, 9, 10) have a high chance of being correct, and some further ones look promising.

We will only publish the solution attempts after midnight (PT), per the authors' guidance - the sha256 hash of the PDF is d74f090af16fc8a19debf4c1fec11c0975be7d612bd5ae43c24ca939cd272b1a.

This was a side-sprint executed in a week mostly by querying one of the models we're currently training; as such, the methodology we employed leaves a lot to be desired. We didn't provide proof ideas or mathematical suggestions to the model during this evaluation; for some solutions, we asked the model to expand upon some proofs, per expert feedback. We also manually facilitated a back-and-forth between this model and ChatGPT for verification, formatting and style. For some problems, we present the best of a few attempts according to human judgement.

We are looking forward to more controlled evaluations in the next round! 1stproof.org #1stProof
Mark Chen@markchen90·
Just because we focus on research doesn't mean we will pursue *all* research. Jakub and I have our own research tastes as well (I think we're fairly good tastemakers 😅), and we have to importance weight the various paths to AGI.
🍓🍓🍓@iruletheworldmo

@markchen90 @merettm sorry to ask. can you speak to why former employees state their reason for leaving as being unable to do research. there’s a lot of speculation you could directly address.

Mark Chen@markchen90·
How does OpenAI balance long-term research bets with product-forward research fundamentals? I’ve been getting this question a lot lately, usually framed as a suggestion that Jakub (@merettm) and I are pushing an increasingly product-focused agenda. That characterization is simply wrong.

Foundational research has been core to OpenAI from the start, and today we run a research program with hundreds of exploratory projects - much like the ones that led to our reasoning-model breakthrough. The majority of our compute is allocated to foundational research and exploration, not product milestones. Anyone who has spent time with me or Jakub knows we are the last people in the world who would push for the advancement of products over the advancement of research.

We’re in the business of creating an automated scientist, and capabilities that were considered grand challenges just a few years ago (like IMO-level mathematical reasoning) now emerge as normal parts of the research process. We’re also seeing our models accelerate researchers worldwide, helping advance work across biology, mathematics, physics, and even our own research.

Jakub and I put a lot of effort into ensuring that research stays focused on uncovering algorithms that will scale to the compute we’ll have a year from now. We protect mindshare and amplify discourse on exploratory work. We do this while recognizing that we’re also a deployment company - and that deployment gives us access to even larger-scale compute, richer feedback, and more room for exploration. Our researchers are passionate about having their work out in the world, and a special slice of our org is dedicated to making sure our deployments are delightful for end users.

Our goal isn’t to turn research into a quarterly race. It’s to build a durable research engine - one that compounds learning over time and consistently turns long-horizon exploration into real, measurable advances, while ensuring those advances become valuable in the real world. That’s the roadmap we’re executing on. And while there have been ups and downs over the last decade (as you expect with any research program), I think most of our researchers would share my strong optimism today.
Mark Chen retweeted
Leeham@Liam06972452·
Erdős Problem #635 autonomously resolved by GPT-5.2 Pro. The model thought for just 50 mins, outputting a correct proof in LaTeX, which was then formalised in Lean by @HarmonicMath's Aristotle. Big thanks to @AcerFur for cleaning up the Lean. Literature review is ongoing.
Leeham tweet media
Mark Chen retweeted
Mark Chen@markchen90·
Congrats to @bchesky and @Ahmad_Al_Dahle! Excited to see what new Airbnb experiences are unlocked when outstanding design meets excellent ML.
Brian Chesky@bchesky

.@Ahmad_Al_Dahle is joining as Airbnb's new CTO. I’m often asked about our AI strategy. We believe pairing great design with frontier technology will help us improve the way people experience travel. Excited to build!

Mark Chen retweeted
Bartosz Naskręcki@nasqret·
Either OpenAI has a team of leprechauns and top mathematicians working 24/7 on FrontierMath questions, or GPT-5.2 Pro has actually become that good at mathematics. I can hardly find any non-trivial hard problem that the model cannot answer after 1–2 hours of interaction. Singularity is near…
Mark Chen retweeted
Poetiq@poetiq_ai·
We finally had a moment to run our system with GPT-5.2 X-High on ARC-AGI-2! Using the same Poetiq harness as before, we saw results as high as 75% at under $8 / problem using GPT-5.2 X-High on the full PUBLIC-EVAL dataset. This beats the previous SOTA by ~15 percentage points.
Poetiq tweet media
Mark Chen retweeted
Miles Wang@MilesKWang·
If AI could interact and learn from the physical world, could it make more scientific advances? We had GPT-5 optimize molecular cloning protocols in the wet lab. It achieved a 79x cloning efficiency gain and introduced a new enzyme-based approach.
Miles Wang tweet media
OpenAI@OpenAI

We’re also testing our models on real world lab experience. We worked with Red Queen Bio to test models to optimize protocols in the lab. GPT-5 proposed, ran (via a controlled framework), and iterated on experiments — increasing a standard molecular cloning protocol's efficiency by 79x with a variety of techniques, including a new enzyme-based approach. openai.com/index/accelera…

Mark Chen@markchen90·
@GregKamradt Some of the smartest people I know are incredibly data inefficient.
Greg Kamradt@GregKamradt·
If you’re so smart, why do you need so much training data?
Mark Chen@markchen90·
@BenjaminDEKR I agree that anything that feels like an ad needs to be handled with care, and we fell short. We’ve turned off this kind of suggestion while we improve the model’s precision. We’re also looking at better controls so you can dial this down or off if you don’t find it helpful.
Benjamin De Kraker@BenjaminDEKR·
OpenAI's @markchen90 says they are looking into the not-great "shop at Target" frustration. Giving them the benefit of the doubt to improve it. Mark always seems genuine to me. My 2 main points are really just: - This is essentially an ad - Let users opt-out or turn this off
Benjamin De Kraker@BenjaminDEKR·
I'm in ChatGPT (paid Plus subscription), asking about Windows BitLocker and it's F-ing showing me ADS TO SHOP AT TARGET. Yeah, screw this. Lose all your users.
Benjamin De Kraker tweet media