
ilge
494 posts

ilge
@ilge
Researcher @OpenAI | there is as yet insufficient data for a meaningful answer.
Palo Alto, CA · Joined February 2011
324 Following · 3.5K Followers

ilge retweeted

2/n We officially competed in the online AI track of the IOI, where we scored higher than all but 5 (of 330) human participants and placed first among AI participants. We had the same 5 hour time limit and 50 submission limit as human participants. Like the human contestants, our system competed *without* internet or RAG, and just access to a basic terminal tool.

ilge retweeted

Another one. Already a powerful painting, but moving around it yourself gives a totally different feeling.
Jacques Louis David's "The Death of Socrates" => #Genie3
ilge retweeted

1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).

ilge retweeted

Watching the model solve these IMO problems and achieve gold-level performance was magical. A few thoughts 🧵
Alexander Wei@alexwei_
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).

Congrats, @isafulf , @EdwardSun0909 and the whole team behind this launch! 🚀
Truly a new era in the space of agentic products.
OpenAI@OpenAI
Today we are launching our next agent capable of doing work for you independently—deep research. Give it a prompt and ChatGPT will find, analyze & synthesize hundreds of online sources to create a comprehensive report in tens of minutes vs what would take a human many hours.
ilge retweeted

don’t miss this part of today’s 12th Day of OpenAI: “Deliberative Alignment,” exciting work by the illustrious @MelodyGuan et al!
the technique achieves a Pareto improvement over previous approaches such as RLHF, and reduces overrefusals!
openai.com/index/delibera…
ilge retweeted

Today OpenAI announced o3, its next-gen reasoning model. We've worked with OpenAI to test it on ARC-AGI, and we believe it represents a significant breakthrough in getting AI to adapt to novel tasks.
It scores 75.7% on the semi-private eval in low-compute mode (for $20 per task in compute) and 87.5% in high-compute mode (thousands of $ per task). It's very expensive, but it's not just brute -- these capabilities are new territory and they demand serious scientific attention.

ilge retweeted

come for my jokes, stay because @ilge and @polynoamial are explaining the next frontier of ai research!
Noam Brown@polynoamial
This episode is worth listening to just to hear @hunterlightman crack jokes for 45 minutes
ilge retweeted

My colleagues and I will be hosting a talk and Q&A session on 'Learning to Reason with LLMs' and the new OpenAI o1 model. Join us for an insightful discussion!
forum.openai.com/public/events/… #OpenAIForum