Peter Welinder

2.1K posts


@npew

VP and GM @OpenAI

San Francisco, CA · Joined April 2010
928 Following · 42.9K Followers
Peter Welinder reposted
vittorio @IterIntellectus:
this is actually insane

> be tech guy in australia
> adopt cancer riddled rescue dog, months to live
> not_going_to_give_you_up.mp4
> pay $3,000 to sequence her tumor DNA
> feed it to ChatGPT and AlphaFold
> zero background in biology
> identify mutated proteins, match them to drug targets
> design a custom mRNA cancer vaccine from scratch
> genomics professor is “gobsmacked” that some puppy lover did this on his own
> need ethics approval to administer it
> red tape takes longer than designing the vaccine
> 3 months, finally approved
> drive 10 hours to get rosie her first injection
> tumor halves
> coat gets glossy again
> dog is alive and happy
> professor: “if we can do this for a dog, why aren’t we rolling this out to humans?”

one man with a chatbot and $3,000 just outperformed the entire pharmaceutical discovery pipeline. we are going to cure so many diseases. I don't think people realize how good things are going to get
Séb Krier @sebkrier:

This is wild. theaustralian.com.au/business/techn…

2.5K replies · 19.9K reposts · 117.9K likes · 17.3M views
Peter Welinder
Sitting in a robot car, talking to an AI, watching my agents work. The future is here.
39 replies · 9 reposts · 201 likes · 9.1K views
Peter Welinder reposted
Cursor @cursor_ai:
We're sharing a new method for scoring models on agentic coding tasks. Here's how models in Cursor compare on intelligence and efficiency:
207 replies · 255 reposts · 2.9K likes · 608.5K views
Peter Welinder reposted
Yam Peleg @Yampeleg:
GPT-5.4 is honestly fantastic, what a great model.
47 replies · 15 reposts · 515 likes · 20.6K views
Peter Welinder reposted
Dev Ed @developedbyed:
codex 5.3 literally gives me the same amount of usage on the $20 plan as claude max lol
203 replies · 87 reposts · 4K likes · 371.6K views
Peter Welinder reposted
Derya Unutmaz, MD @DeryaTR_:
I’ve had early access to GPT-5.4 Pro. Without any reservation, I can say it is the most intelligent AI model to date, significantly surpassing even GPT-5.2 Pro on several levels! I’ve been using it non-stop for the past several days and am super excited about another major jump in AI!

I will share specific examples, but overall GPT-5.4 Pro demonstrates noticeably higher creativity, insight, and abstract intelligence. It tends to ask “why,” “what if,” “can I,” and “why it matters” type questions more frequently than the 5.2 Pro model. It also appears to generalize more effectively, comes across as more AGI-like in its reasoning, and even displays human-like intuition! Its biomedical science responses in particular unify large data sets and are simply amazing!
41 replies · 58 reposts · 873 likes · 50.2K views
Peter Welinder reposted
Sergey Karayev @sergeykarayev:
GPT 5.4 in the Codex harness hit ALL-TIME HIGHS on our Rails benchmark. Both cheaper and better than GPT 5.2 and Opus/Sonnet models (in the Claude Code harness). You can test it yourself on your own codebase (whatever the tech stack) below.
11 replies · 11 reposts · 164 likes · 10.2K views
Peter Welinder reposted
Tejal Patwardhan @tejalpatwardhan:
GPT-5.4 is state-of-the-art on GDPval, and here are some examples of how the model is much better at well-specified knowledge work tasks. Six months ago the models could barely make a spreadsheet or slide! Progress is happening really fast.
17 replies · 24 reposts · 348 likes · 59.9K views
Peter Welinder reposted
Yann Dubois @yanndubs:
🔥 Two things I'm especially excited about in 5.4:

1. Unification: we merged our codex & mainline models.
2. Efficiency: we brought the efficiency of 5.3-codex to CUA & knowledge work. We only showed 3 such plots in the blog, but many of our evals required less time (tokens/tools) than 5.2.

What should we fix for the next model?
51 replies · 29 reposts · 561 likes · 44.3K views
Peter Welinder reposted
ben @benhylak:
i’ve been using gpt 5.4 for the past few weeks. in a sea of endless model drops and benchmark maxxing, this model is the first in a long time to be worth your time to try. honestly didn’t expect openai to pull this off.
104 replies · 45 reposts · 1.4K likes · 862.6K views
Peter Welinder reposted
Dan Shipper 📧 @danshipper:
BREAKING: @OpenAI just released GPT-5.4 and it is AMAZING. We spent a week @every putting it through real engineering tasks, from code reviews to planning workflows, and using it inside of our @openclaw setups. The verdict: OpenAI is back in the coding race.

- Its planning capability consistently beat Codex 5.3 and Opus 4.6 in head-to-head tests. It produces plans that are thorough and technically precise, with a user focus and “human” feel that has been missing from OpenAI's previous coding models.
- It reviews code with more depth than 5.3 Codex, and with a much more conversational voice that doesn't make you feel dumb.
- It became our go-to model in @OpenClaw: with some model-specific tweaks to the harness it's fast, intelligent, and more human. It's also about half the price of Opus 4.6.

As ever, there are tradeoffs:

- GPT-5.4 has a tendency to expand the task well beyond what you asked for and to call tasks done before they're finished.
- In the @OpenClaw harness it sometimes completed tasks in obviously wrong ways, then lied about it.

Overall, though, it's my new daily driver for coding and in my Claw. Its thinking traces produced some genuine wow moments for me. Our complete vibe check is available on @every now -> every.to/vibe-check/gpt…
44 replies · 41 reposts · 503 likes · 69.9K views
Peter Welinder reposted
Matt Shumer @mattshumer_:
I've been testing GPT-5.4 for the last week. In short, it is the best model in the world, by far. It's so good that it's the first model that makes the “which model should I use?” conversation feel almost over.

The biggest surprise: I barely use Pro anymore! If you know me, you know I'm a Pro addict. I reach for Pro models constantly and use them for almost everything, as they just... nail almost anything I give to them. For the first time, 5.4's standard version, with heavy thinking, broke that habit. Even in standard mode, GPT-5.4 is better than previous models in Pro mode... crazy!

Coding capabilities are ridiculous... it's essentially flawless. Inside Codex, it's insanely reliable. Coding is essentially solved. There's not much more to say on this, it's just THAT good.

The Pro version is near-perfect. Other testers I spoke with saw it solving problems that were unsolvable by any other model. At this point, Pro is overkill for almost every normal use case, but when you really need the power to do something extremely difficult, it's incredible.

Consistent with everything I've said above, even the standard thinking version uses fewer reasoning tokens than previous models to get the same level of results. In practice, this means you get great results much faster than before. This was one of my biggest gripes with previous OpenAI models: they just took too long to complete simple tasks. Assuming the speed we had during testing holds up as more users join, this is going to be a big win for OpenAI.

It still has weaknesses, though:

- Frontend taste is FAR behind Opus 4.6 and Gemini 3.1 Pro. Why is this so hard to fix? @OpenAI, once you fix this, there's literally no reason for me to use any other model. Please please please do it!
- It can still miss obvious real-world context. For example, I had it plan an itinerary for a trip. At first glance it looked perfect, but it failed to account for the fact that the locations it chose would be mobbed by spring breakers, so I had to re-run the prompt from scratch with more context.
- When testing it inside OpenClaw, it kept stopping short before finishing tasks. I'm assuming this will be fixed quickly, but it's still worth noting.

But zooming out: this thing is so far ahead overall that the nitpicks are starting to feel beside the point. GPT-5.4 is a serious fucking model. The best model in the world. By far.
335 replies · 233 reposts · 3K likes · 1.5M views
Peter Welinder reposted
Mitchell Hashimoto @mitchellh:
Ahhhh, Codex 5.3 (xhigh) with a vague prompt just solved a bug that I and others have been struggling to fix for over 6 months. Other reasoning levels with Codex failed; Opus 4.6 failed. Cost $4.14 and 45 minutes. Full trace, which includes the original issue: ampcode.com/threads/T-019c…

I know this prompt is relatively bad. Honestly, our stable release is in a week, and I was throwing some Hail Marys at the frontier models to see if I could get a clean, understandable fix for some of these bugs. By using `gh`, it grabs much better context from the issue, so it's not terrible.

The best thing Codex did was eventually start reading GTK4 source code. That's where I ended up (see my GH issue), and I knew the answer was somewhere in there, but I didn't have the time or motivation to do it myself. The other models never went there, and lower reasoning efforts with 5.3 didn't go there either. Only xhigh went there. I think that was a critical difference.

The final fix was decent. It was small, all in a single file, and very understandable. It had one bug I identified (you can see it in the trace), and then I manually cleaned up some style. But it did a great job. Definitely an “it's so over” moment. But at the same time, it feels amazing, because now our next stable release will have this fix and I was able to spend the time working on other fixes as it went.
120 replies · 223 reposts · 3.6K likes · 399.7K views
Peter Welinder reposted
Yana Welinder @yanatweets:
Putting Nano Banana 2 to the test. Nano Banana 2 vs. ChatGPT Image 1.5. Same sketch. Same prompt. ChatGPT Image 1.5 has stronger instruction following and better taste.
21 replies · 4 reposts · 35 likes · 5.8K views
Peter Welinder reposted
JB @JasonBotterill:
Sorry for shitting on 5.3-Codex, but I’ve actually seen the light. After using it for 5 hours today, I'm Codex-pilled. I run it at high reasoning and it’s actually way faster at responding now.
20 replies · 8 reposts · 431 likes · 22.2K views
Peter Welinder reposted
Lovable @Lovable:
Lovable now uses GPT-5.3-Codex for solving the most complex problems. It is significantly stronger and 3-4x more token-efficient than GPT-5.2.
82 replies · 56 reposts · 1.2K likes · 99.9K views
Peter Welinder reposted
Mariusz Kurman @mkurman88:
Those who wrote "Try Codex" when I was hyping CC were right. Codex 5.3 is on another level. It delivers so much with such high quality that I'm literally shocked. 5.2 was a mess; 5.3 is in another league.
40 replies · 23 reposts · 553 likes · 47.2K views