Paul Ashbourne

447 posts


@paulashbourne

agents + rl infra @openai | 💬 Opinions are my own | Made in Canada 🇨🇦

San Francisco, CA · Joined January 2011
371 Following · 396 Followers
Paul Ashbourne retweeted
roon
roon@tszzl·
roon tweet media
70
77
1.7K
71.1K
Paul Ashbourne retweeted
Seán Ó hÉigeartaigh
Seán Ó hÉigeartaigh@S_OhEigeartaigh·
Real slap in the face to every OpenAI person who spoke up in Anthropic's support. Anthropic staff should be embarrassed - not that this was leaked, but that it was sent. After weeks of calling for industry solidarity, I'm embarrassed.
18
9
233
30.1K
Paul Ashbourne retweeted
Dean W. Ball
Dean W. Ball@deanwball·
I do not share the cynicism of some with respect to OpenAI’s actions in the DoW/Ant dispute. It basically seems to me as though OpenAI was attempting to deescalate last week; whether they executed well is a separate question, but in their defense good execution in such chaos was nearly impossible.

But from where I sit it seems OpenAI tried to reduce tensions and find a productive path forward, while allowing its employees considerable latitude to speak their minds. The easy thing would have been for management to stay quiet and let this happen; they did not do that, and they also stood firm in opposition to the supply-chain risk designation.

In general, OpenAI is unjustly maligned. This is the thing that bothers me the most about Dario’s leaked memo; it spends so much time on OpenAI conspiracies and cynicism that I fear industry solidarity in the future will be harder than it needs to be. This is not the last time we will see state interference into frontier AI, and until we build formalized structures for such interference it will be important for the industry to hang tough together. I fear that will be less likely now.
39
40
519
42.2K
Paul Ashbourne
Paul Ashbourne@paulashbourne·
@m_franceschetti Sounds like a plan. Good luck to you and the team sprinting on outage mode, and thanks for leading transparently on this. 🚀
1
0
5
13.2K
Matteo Franceschetti
Matteo Franceschetti@m_franceschetti·
@paulashbourne Thanks Paul. Let me fix this first, ship an outage mode and then we will look at the next features ;)
5
0
55
188.9K
Matteo Franceschetti
Matteo Franceschetti@m_franceschetti·
The AWS outage has impacted some of our users since last night, disrupting their sleep. That is not the experience we want to provide and I want to apologize for it. We are taking two main actions:
1) We are restoring all the features as AWS comes back. All devices are currently working, with some experiencing data processing delays.
2) We are currently outage-proofing your Pod experience and we will be working tonight, 24/7, until that is done.
More updates soon.
673
195
4.8K
7.9M
Paul Ashbourne
Paul Ashbourne@paulashbourne·
sora 2 is essentially interdimensional cable, but short form
Paul Ashbourne tweet media
0
0
5
2K
Paul Ashbourne retweeted
OpenAI
OpenAI@OpenAI·
10am PT.
539
481
5.7K
2.1M
Abhishek Bhardwaj
Abhishek Bhardwaj@abshkbh·
For the past year I’ve been building Arrakis on a single thesis: with the right tools and secure environments, LLMs can reliably do complex work. This journey started two years ago when I left a stable role at Google to work on early coding agents. While still at Google, I wrote a long email to @gdb about how a systems engineer could break into AI. Arrakis opened doors and has led to a full-circle moment: I’ve joined @OpenAI to work on Agent Infrastructure in the Scaling org. It’s a privilege to help people through smarter models and agents. I’m especially excited about our coding initiatives. Thank you @gdb and @paulashbourne for the opportunity. Looking back, the biggest risk was not taking one!
Abhishek Bhardwaj tweet media
27
12
399
121.3K
roon
roon@tszzl·
@paulg I liked the old one better tbh
9
0
99
27K
Paul Graham
Paul Graham@paulg·
I finally went to visit OpenAI's new building. It's the nicest office I've ever seen. So many different shaped spaces, and such good color. Whoever was in charge of this did a really good job.
223
106
6.7K
747K
Paul Ashbourne retweeted
Artificial Analysis
Artificial Analysis@ArtificialAnlys·
OpenAI gave us early access to GPT-5: our independent benchmarks verify a new high for AI intelligence. We have tested all four GPT-5 reasoning effort levels, revealing 23x differences in token usage and cost between the ‘high’ and ‘minimal’ options, and substantial differences in intelligence. We have run our full suite of eight evaluations independently across all reasoning effort configurations of GPT-5 and are reporting benchmark results for intelligence, token usage, and end-to-end latency.

What @OpenAI released: OpenAI has released a single endpoint for GPT-5, but different reasoning efforts offer vastly different intelligence. GPT-5 with reasoning effort “High” reaches a new intelligence frontier, while “Minimal” is near GPT-4.1 level (but more token efficient).

Takeaways from our independent benchmarks:

⚙️ Reasoning effort configuration: GPT-5 offers four reasoning effort configurations: high, medium, low, and minimal. Reasoning effort options steer the model to “think” more or less hard for each query, driving large differences in intelligence, token usage, speed, and cost.

🧠 Intelligence achieved ranges from frontier to GPT-4.1 level: GPT-5 sets a new standard with a score of 68 on our Artificial Analysis Intelligence Index (MMLU-Pro, GPQA Diamond, Humanity’s Last Exam, LiveCodeBench, SciCode, AIME, IFBench & AA-LCR) at High reasoning effort. Medium (67) is close to o3, Low (64) sits between DeepSeek R1 and o3, and Minimal (44) is close to GPT-4.1. While High sets a new standard, the increase over o3 is not comparable to the jump from GPT-3 to GPT-4 or GPT-4o to o1.

💬 Token usage varies 23x between reasoning efforts: GPT-5 with High reasoning effort used more tokens than o3 (82M vs. 50M) to complete our Index, but still fewer than Gemini 2.5 Pro (98M) and DeepSeek R1 0528 (99M). However, Minimal reasoning effort used only 3.5M tokens, which is substantially less than GPT-4.1, making GPT-5 Minimal significantly more token-efficient for similar intelligence.

📖 Long Context Reasoning: We released our own Long Context Reasoning (AA-LCR) benchmark earlier this week to test the reasoning capabilities of models across long sequence lengths (sets of documents ~100k tokens in total). GPT-5 stands out for its performance on AA-LCR, with GPT-5 at both High and Medium reasoning efforts topping the benchmark.

🤖 Agentic Capabilities: OpenAI also commented on improvements across capabilities increasingly important to how AI models are used, including agents (long-horizon tool calling). We recently added IFBench to our Intelligence Index to cover instruction following and will be adding further evals to cover agentic tool calling to independently test these capabilities.

📡 Vibe checks: We’re testing the personality of the model through MicroEvals on our website, which supports running the same prompt across models and comparing results. It’s free to use; we’ll provide an update with our perspective shortly, but feel free to share your own!

See below for further analysis:
Artificial Analysis tweet media
44
125
712
105.2K
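The reasoning effort levels described in the benchmark thread map to a per-request API setting. A minimal sketch of how one might sweep those levels, assuming the Chat Completions-style `reasoning_effort` field and the `gpt-5` model name from the tweet (both are assumptions here, not verified API details); the requests are only constructed, not sent:

```python
# Sketch: building one request payload per reasoning effort level.
# The "gpt-5" model name and the `reasoning_effort` field are
# assumptions taken from the tweet, not confirmed API parameters.
EFFORT_LEVELS = ["minimal", "low", "medium", "high"]

def build_request(prompt: str, effort: str) -> dict:
    # Payload shape mirrors a Chat Completions-style request body.
    return {
        "model": "gpt-5",
        "reasoning_effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }

requests = [
    build_request("Prove that the square root of 2 is irrational.", effort)
    for effort in EFFORT_LEVELS
]
```

Sweeping all four payloads against the same prompt is what makes the 23x token-usage spread between ‘minimal’ and ‘high’ directly observable in a single run.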
Jason Lee
Jason Lee@jasondeanlee·
How do I short oai before gpt5 release?
15
0
95
10.1K
Paul Ashbourne
Paul Ashbourne@paulashbourne·
There are going to be a lot of high 5s going around the @openai office tomorrow
0
0
7
547
Paul Ashbourne retweeted
Sebastien Bubeck
Sebastien Bubeck@SebastienBubeck·
It’s hard to overstate the significance of this. It may end up looking like a “moon‑landing moment” for AI. Just to spell it out as clearly as possible: a next-word prediction machine (because that's really what it is here, no tools no nothing) just produced genuinely creative proofs for hard, novel math problems at a level reached only by an elite handful of pre‑college prodigies.
Alexander Wei@alexwei_

1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).

59
156
1.4K
261K
Paul Ashbourne
Paul Ashbourne@paulashbourne·
@anupk24 @ns123abc 4th of July is one thing, but just wait until they find out that we do the same thing at Thanksgiving
0
0
14
801
Anup
Anup@anupk24·
@ns123abc This happens literally every year and has been on my calendar for months...
4
1
105
6.9K
NIK
NIK@ns123abc·
🚨NEWS: OpenAI is officially shutting down next week “to give employees time to recharge” LMAO
NIK tweet media
180
87
1.9K
284.5K