Nate
3 posts

Nate
@Nateyoung23
Information Systems @cmuniversity
Gladstone, MI · Joined December 2021
73 Following · 41 Followers
Nate reposted

Whoa. This new GDPval score is a very big deal.
Probably the most economically relevant measure of AI ability, suggesting that in head-to-head competition with human experts on tasks that require 4-8 hours for a human to do, GPT-5.2 wins 71% of the time, as judged by other humans.

Ethan Mollick @emollick
After reading it, this does seem like a big deal. Industry experts outlined important, real-world, hard tasks for AI to do. Other experts were asked to do the tasks themselves & yet others graded human & AI output. Models approached parity with humans & AI is getting better fast.

