NonpartisanEducation

5.9K posts

@NPEreview

Nonpartisan Education Group, Outside Both Boxes. Forum for those interested in education policy & not aligned with vested interests of either political party.

USA · Joined July 2011
220 Following · 417 Followers
NonpartisanEducation retweeted
Catherine Johnson@smarterparrot·
Haven’t read the paper, but the apparent fact that the LLMs are getting worse, not better, is what you’d expect, given that performance deteriorates when AIs train on their own output. People assume that technology gets better over time, but it’s not true in this case. (The fact that LLMs produce inferior training material ought to give the evangelists pause, but no.)

“The numbers are wild:
→ OpenAI's o1 model: 16% hallucination rate
→ Their o3 model: 33%
→ Their newest o4-mini: 48%”
Saidul@saidul_dev

OpenAI just published a paper proving that ChatGPT will always hallucinate. Not sometimes. Not "until the next version." Always. They proved it mathematically. And three other top AI labs confirmed it independently. Here's what the research actually shows:

Even with perfect training data and unlimited compute, LLMs will still fabricate answers with complete confidence. This isn't a bug in the code. It's fundamental to how these systems are built.

The numbers are wild:
→ OpenAI's o1 model: 16% hallucination rate
→ Their o3 model: 33%
→ Their newest o4-mini: 48%

Nearly half of what their latest model tells you could be invented. And it's getting worse as models get "smarter."

Here's why this can't be fixed: language models predict the next word based on probability. When they hit uncertainty, they don't pause. They don't flag it. They guess with total confidence. Because that's literally what they were trained to do.

The researchers analyzed the 10 major AI benchmarks used to test these models. 9 out of 10 give the exact same score for saying "I don't know" as for getting it completely wrong: zero points. The entire testing system punishes honesty and rewards confident guessing. So the AI learned the optimal strategy: always answer. Never show doubt. Sound certain even when making it up.

OpenAI's proposed solution? Train models to say "I don't know" when uncertain. The problem? Their own math shows this would leave roughly 30% of questions unanswered. Imagine getting "I'm not confident enough to respond" three times out of ten. Users would abandon the product overnight. The fix exists. But it kills usability.

This isn't just OpenAI's problem. DeepMind and Tsinghua University reached identical conclusions working separately. Three elite AI labs. Independent research. Same result: this is permanent.

Every time you get an answer from any LLM, you're not getting facts. You're getting the most statistically probable next words from a system that's been rewarded for never admitting when it's guessing. Is this real information, or just a confident hallucination? You can't know. And neither can the AI.
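[The benchmark-scoring argument in the thread above can be sketched in a few lines. This is a hypothetical scoring function, not any actual benchmark's rubric: it just shows why, when "I don't know" and a wrong answer both score zero, guessing always maximizes expected score.]

```python
# Hypothetical sketch of the incentive argument: if abstaining and a
# wrong answer both score 0, answering dominates at any confidence level;
# only a penalty for wrong answers makes "I don't know" rational.

def expected_score(p_correct, answer, wrong_penalty=0.0):
    """Expected score on one question.

    p_correct     -- model's chance of being right if it answers
    answer        -- True to guess, False to say "I don't know"
    wrong_penalty -- points deducted for a confidently wrong answer
    """
    if not answer:
        return 0.0  # abstaining scores zero, same as a wrong answer
    return p_correct * 1.0 - (1 - p_correct) * wrong_penalty

# Under zero-penalty scoring, even a 1%-confident guess beats honesty:
assert expected_score(0.01, answer=True) > expected_score(0.01, answer=False)

# With a -1 penalty for wrong answers, guessing at 30% confidence
# has negative expected value, so abstaining becomes optimal:
assert expected_score(0.3, answer=True, wrong_penalty=1.0) < 0.0
```

[Under the zero-penalty rubric the thread describes, "always answer" is the optimal policy regardless of confidence, which is the claimed training incentive.]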

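[The "performance deteriorates when AIs train on their own output" claim above (often called model collapse) can be illustrated with a toy simulation. This is a minimal hypothetical sketch, not the setup of any cited paper: a Gaussian model repeatedly refit on its own finite samples drifts away from the original data distribution.]

```python
import random
import statistics

random.seed(0)
mu, sigma = 0.0, 1.0  # generation 0: the "real data" distribution

for generation in range(30):
    # the current model generates a finite sample of its own output...
    sample = [random.gauss(mu, sigma) for _ in range(25)]
    # ...and the next-generation model is fit only to that sample,
    # so each refit inherits the previous generation's sampling noise
    mu = statistics.fmean(sample)
    sigma = statistics.stdev(sample)

print(f"after 30 generations: mu={mu:.3f}, sigma={sigma:.3f}")
```

[Because each generation sees only a finite sample of the last, the fitted parameters random-walk away from the true (0, 1); with language models the analogous effect is a gradual loss of the original distribution's diversity.]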
NonpartisanEducation retweeted
Ling Huang@FightFuzzyMath·
My articles and presentations on how progressive education dogmas have corrupted K-12 math education in the US and beyond:
• Jo Boaler’s Fame, Stanford’s Shame; Students’ Gloom, America’s Doom: rb.gy/amoc52
• Jo Boaler's Reform Math Fallacy: bit.ly/38oASeE
• Stanford Professor Jo Boaler’s Math Revolution and War Against Algebra 2: nonpartisaneducation.org/blog1/2020/10/…
• Why Have American Schools Failed in Closing the Achievement Gap? A Case Study of California’s Palo Alto School District: nonpartisaneducation.org/Review/Testimo…
• Letters from Mathematicians: nonpartisaneducation.org/Review/Resourc… (letters from R. James Milgram and Wayne Bishop on the deterioration of K-12 math education)
• 是谁夺走了美国人的数学能力？美国百年数学战争演义 (Who Deprived Americans of Their Math Ability? A Saga of Century-long American Math Wars): rb.gy/xvg3cf
NonpartisanEducation retweeted
Alt B @AltB56073878 ·
[image: ZXX]