Josh Vendrow

111 posts

@josh_vendrow

Safety training @OpenAI | on leave from PhD at MIT

Joined December 2022

360 Following · 355 Followers

Josh Vendrow reposted

Jakub Pachocki@merettm·Feb 14

Very excited about the "First Proof" challenge. I believe novel frontier research is perhaps the most important way to evaluate capabilities of the next generation of AI models. We have run our internal model with limited human supervision on the ten proposed problems. The problems require expertise in their respective domains and are not easy to verify; based on feedback from experts, we believe at least six solutions (2, 4, 5, 6, 9, 10) have a high chance of being correct, and some further ones look promising. We will only publish the solution attempts after midnight (PT), per the authors' guidance - the sha256 hash of the PDF is d74f090af16fc8a19debf4c1fec11c0975be7d612bd5ae43c24ca939cd272b1a . This was a side-sprint executed in a week mostly by querying one of the models we're currently training; as such, the methodology we employed leaves a lot to be desired. We didn't provide proof ideas or mathematical suggestions to the model during this evaluation; for some solutions, we asked the model to expand upon some proofs, per expert feedback. We also manually facilitated a back-and-forth between this model and ChatGPT for verification, formatting and style. For some problems, we present the best of a few attempts according to human judgement. We are looking forward to more controlled evaluations in the next round! 1stproof.org #1stProof
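The post above commits to a PDF by publishing its sha256 hash ahead of release. A minimal sketch of how a reader might check a downloaded copy against that hash, assuming Python's standard `hashlib`; the function names and the file name in the usage note are hypothetical and do not come from the post:

```python
import hashlib

# Hash quoted in the post above.
PUBLISHED_SHA256 = "d74f090af16fc8a19debf4c1fec11c0975be7d612bd5ae43c24ca939cd272b1a"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex sha256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns empty bytes at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=PUBLISHED_SHA256):
    """True if the file's digest equals the expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected.strip().lower()
```

Calling `matches_published_hash("first_proof_solutions.pdf")` (a hypothetical local file name) returns `True` only if the file is byte-for-byte identical to the one that was hashed.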

Josh Vendrow reposted

Cameron Raymond@CJKRaymond·Dec 3

for now i’m more interested in the easiest problem the model can’t solve, rather than the hardest one it can. we underestimate how important reliability is!

Sebastien Bubeck@SebastienBubeck

It's getting harder and harder to get signal from benchmark numbers. Rather than averages, I expect in the (near) future we will also care about "argmax": what's the BEST output a model can deliver? After all, we don't need to solve PvsNP 10 out of 10 times, once is enough 😅. So with that in mind let me tell you a bit more about THE MOST IMPRESSIVE LLM OUTPUT I have ever seen.

Josh Vendrow reposted

Sebastien Bubeck@SebastienBubeck·Nov 30

I can't tell if it's a joke or not, but no matter what it's very funny 🤣

Josh Vendrow reposted

Factory@FactoryAI·Nov 20

We detected and disrupted a highly automated cyber operation attempting to use Factory as a node in a worldwide mesh of “off-label” LLM usage. The attackers deployed AI coding agents to generate and maintain their infrastructure, adapt to our defenses in real time, and orchestrate traffic from tens of thousands of synthetic organizations. This attack mirrored similar incidents across the industry, including those recently disclosed by @anthropicAI.

Josh Vendrow reposted

Eno Reyes@EnoReyes·Oct 31

@dvendrow cooked with this one

Arif@rezaul_arif

Huge thanks to whoever revamped the MCP implementation in @FactoryAI CLI v0.22.6! 🫶🏼

Josh Vendrow reposted

Mubashara Akhtar@akhtarmubashara·Oct 17

Check out our new weekly series @evaluatingevals where we spotlight papers on AI evaluations. 🔦 Kicking off with “Do Large Language Model Benchmarks Test Reliability?” by @josh_vendrow et al.

EvalEval Coalition@evaluatingevals

✨Weekly AI Evaluation Paper Spotlight✨ 🕵️ Are benchmark noise and label errors masking the true fragility of LLMs? 🖇️"Do Large Language Model Benchmarks Test Reliability?" - This paper by @josh_vendrow, @EdwardVendrow, @sarameghanbeery, and @aleks_madry provides insights!

Josh Vendrow reposted

Adam Tauman Kalai@adamfungi·Sep 5

New research explains why LLMs hallucinate, through a connection between supervised and self-supervised learning. We also describe a key obstacle that can be removed to reduce them. 🧵openai.com/index/why-lang…

Josh Vendrow reposted

Santiago Hernández@santiaghini·Aug 25

I genuinely think that fixing hallucinations is the most important and under-discussed improvement from gpt-5. It’s so basic, but having a reliable coworker that you can trust enables so much more.

Noam Brown@polynoamial

GPT-5 Thinking definitely isn’t perfect, but it’s the first AI model I can trust more than many common sources of truth on the internet.

Josh Vendrow@josh_vendrow·Aug 11

@permaximum88 @aidan_mclau To make these improvements clearer, we added evaluations on prompts from existing open-ended factuality benchmarks (LongFact, FActScore) and saw huge improvements as well!

Josh Vendrow@josh_vendrow·Aug 11

@permaximum88 @aidan_mclau Our focus when training GPT-5 was to decrease hallucinations on open-ended questions, which reflect what users actually experience far better than SimpleQA. That’s why we see huge improvements on prompts that represent production traffic.

Aidan McLaughlin@aidan_mclau·Aug 11

one under-discussed element of gpt5 is it just hallucinates soooo much less, sweeping away 80% of o3-era ed zitronism and marcus-posting, but, because we’re good sports, we give them an evergreen batch of things to critique

Josh Vendrow@josh_vendrow·Aug 10

@ericmitchellai Incredibly bullish

Eric@ericmitchellai·Aug 10

this is what progress looks like

Chubby♨️@kimmonismus

GPT-5 admits it "doesn't know" an answer! This is one of the huge improvements over previous models: instead of hallucinating, it lets you know its limits.

Josh Vendrow@josh_vendrow·Aug 8

@HanGuo97 @OpenAI First metric on which AI would surpass me honestly

Han Guo@HanGuo97·Aug 8

@josh_vendrow @OpenAI Can you make GPT6 bench 3 plates?

Josh Vendrow@josh_vendrow·Aug 7

Feels like the right moment to share that I recently joined @OpenAI on the safety training team and got to support this incredible launch!

OpenAI@OpenAI

GPT-5 is here. Rolling out to everyone starting today. openai.com/gpt-5/

Josh Vendrow@josh_vendrow·Aug 7

@ericmitchellai sicko mode

Eric@ericmitchellai·Aug 7

Josh Vendrow reposted

Aleksander Madry@aleks_madry·Jul 31

Today's episode is one I have wanted to do for a while: a chat with OpenAI’s power duo, Chief Scientist @merettm and Technical Fellow @sidorszymon (and also my friends!).

Josh Vendrow@josh_vendrow·Jul 30

@ion_barrel @kalomaze 👀 I've seen some nice GPUs here... personally.

Victor Butoi@ion_barrel·Jul 30

@kalomaze This is ✨ wrong ✨

kalomaze@kalomaze·Jul 28

it genuinely looks as if MIT doesn't actually have any H100s. please someone show me if i am reading this wrong? it can't be that bad right? right???

kalomaze@kalomaze

@stevenshinechen omg... if i'm reading this page correctly a majority of the GPUs you guys have campus wide access to are from before Ampere was even a thing > "...more than 850 NVidia Volta GPUs in total...."

Josh Vendrow@josh_vendrow·Jul 23

Sometimes one simply cannot stop cooking.

Owain Evans@OwainEvans_UK

New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵

Josh Vendrow reposted

Aleksander Madry@aleks_madry·Jul 15

Season 2 of Before AGI rolls on! This week I sat down with Harvard Law professor & Berkman Klein Center co-founder @zittrain to ask: What happens when AI agents don’t just assist—but act for us? Insights into law, tech & trust in an autonomous-agent world incoming!

Josh Vendrow reposted

Andrew Ilyas@andrew_ilyas·Jun 26

“How will my model behave if I change the training data?” Recent(-ish) work w/ @logan_engstrom: we nearly *perfectly* predict ML model behavior as a function of training data, saturating benchmarks for this problem (called “data attribution”).

Josh Vendrow reposted

Giannis Daras@giannis_daras·Jun 16

Announcing Ambient Diffusion Omni: a framework that uses synthetic, low-quality, and out-of-distribution data to improve diffusion models. State-of-the-art ImageNet performance. Strong text-to-image results in just 2 days on 8 GPUs. Filtering ❌ Clever data use ✅
