Raja-Elie Abdulnour

537 posts

@BageLeMage

Editor-in-chief @NEJMClinician, Clinical development and AI innovation @NEJM - Views are my own.

Boston · Joined March 2010
164 Following · 889 Followers

Raja-Elie Abdulnour retweeted
Eric Topol @EricTopol
Expressing uncertainty is a major weak spot of LLMs in medicine @NEJM "Can AI Say I Don't Know?" Good lines: "Contemporary LLMs have passed many Turing tests, but will they pass this modern test of not knowing? We don’t know." nejm.org/doi/full/10.10…
13 replies · 55 retweets · 214 likes · 28.9K views
Raja-Elie Abdulnour @BageLeMage
Two key takeaways. 1) Clinician knowledge and experience matter: AI explanations improved diagnostic accuracy of radiologists, but only for specialists given images in their own domain, and those with >12 yrs experience. 2) When AI gave an incorrect differential diagnosis, 60-80% of clinicians endorsed it, but very few did when AI gave a single Dx with/without an explanation. nature.com/articles/s4174…
0 replies · 0 retweets · 4 likes · 161 views
Raja-Elie Abdulnour @BageLeMage
"AI as mental contaminant" framework is basically the Sophons without Trisolarans to blame. The contamination is everywhere: emergent, distributed, and optimized for engagement. That's the harder problem. frontiersin.org/journals/artif…
1 reply · 0 retweets · 2 likes · 84 views
Mustafa Suleyman @mustafasuleyman
Our paper landed in Nature Health today! Healthcare is one of the most high-stakes, high-potential applications of AI. So we set out to understand how people actually use it in our AI products today. nature.com/articles/s4436…
24 replies · 67 retweets · 375 likes · 36.3K views
Raja-Elie Abdulnour @BageLeMage
A taxonomy of LLM-associated psychotic phenomena based on the AI's role: catalyst (precipitating new symptoms), amplifier (worsening pre-existing psychiatric symptoms), coauthor (participating in the development of harmful narratives), or object (becoming the focus of delusions). Very useful. thelancet.com/journals/landi…
0 replies · 0 retweets · 2 likes · 132 views
Raja-Elie Abdulnour @BageLeMage
This is why medical experts’ reasoning MUST be embedded in AI-driven clinical decision support tools. The big question is how. RAG with high-quality medical content is only one of the possible ways [a minimal RAG sketch follows after this thread]. Are there other ways? Yes there are.
Nav Toor @heynavtoor

🚨BREAKING: OpenAI published a paper proving that ChatGPT will always make things up. Not sometimes. Not until the next update. Always. They proved it with math. Even with perfect training data and unlimited computing power, AI models will still confidently tell you things that are completely false. This isn't a bug they're working on. It's baked into how these systems work at a fundamental level.

And their own numbers are brutal. OpenAI's o1 reasoning model hallucinates 16% of the time. Their newer o3 model? 33%. Their newest o4-mini? 48%. Nearly half of what their most recent model tells you could be fabricated. The "smarter" models are actually getting worse at telling the truth.

Here's why it can't be fixed. Language models work by predicting the next word based on probability. When they hit something uncertain, they don't pause. They don't flag it. They guess. And they guess with complete confidence, because that's exactly what they were trained to do.

The researchers looked at the 10 biggest AI benchmarks used to measure how good these models are. 9 out of 10 give the same score for saying "I don't know" as for giving a completely wrong answer: zero points. The entire testing system literally punishes honesty and rewards guessing. So the AI learned the optimal strategy: always guess. Never admit uncertainty. Sound confident even when you're making it up. [See the worked expected-score example after this thread.]

OpenAI's proposed fix? Have ChatGPT say "I don't know" when it's unsure. Their own math shows this would mean roughly 30% of your questions get no answer. Imagine asking ChatGPT something three times out of ten and getting "I'm not confident enough to respond." Users would leave overnight. So the fix exists, but it would kill the product.

This isn't just OpenAI's problem. DeepMind and Tsinghua University independently reached the same conclusion. Three of the world's top AI labs, working separately, all agree: this is permanent. Every time ChatGPT gives you an answer, ask yourself: is this real, or is it just a confident guess?

2 replies · 0 retweets · 4 likes · 1.5K views
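
The grading argument in the quoted tweet comes down to simple expected-value arithmetic. The sketch below works it through; the 0 and -1 penalty values and the probabilities are illustrative assumptions, not figures from the OpenAI paper.

# Back-of-envelope check of the quoted claim that binary-graded
# benchmarks reward guessing over abstaining. Penalty values and
# probabilities here are illustrative, not numbers from the paper.
def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected points for guessing: 1 point if right, penalty if wrong."""
    return p_correct * 1.0 + (1.0 - p_correct) * wrong_penalty

ABSTAIN = 0.0  # "I don't know" scores zero under both schemes

for p in (0.1, 0.3, 0.5):
    binary = expected_score(p, wrong_penalty=0.0)   # wrong answers cost nothing
    marked = expected_score(p, wrong_penalty=-1.0)  # wrong answers are penalized
    print(f"p={p:.1f}  binary={binary:+.2f}  "
          f"with-penalty={marked:+.2f}  abstain={ABSTAIN:+.2f}")

# Under 0/1 grading, guessing beats abstaining for any p > 0, so a model
# tuned to maximize benchmark score learns to always guess. With negative
# marking, abstaining wins whenever p < 0.5.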
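
As for the RAG approach mentioned at the top of this thread, here is a minimal sketch of grounding a model in curated medical content. Everything in it is a stand-in: the CORPUS snippets are invented, the bag-of-words retrieval is a toy substitute for a real embedding model, and the final LLM call is omitted; the sketch stops at building the grounded prompt.

# Minimal sketch of the retrieval step in RAG over a curated medical
# corpus. Snippets, "embedding," and prompt wording are illustrative;
# a real system would use vetted guideline text, a learned embedding
# model, and would pass the built prompt to an actual LLM.
import math
from collections import Counter

# Hypothetical curated snippets (in practice: vetted guideline content).
CORPUS = [
    "Community-acquired pneumonia: first-line empiric therapy options ...",
    "Pulmonary embolism: Wells score stratifies pretest probability ...",
    "Sepsis: obtain cultures, then start broad-spectrum antibiotics ...",
]

def bow(text: str) -> Counter:
    """Bag-of-words term counts; a toy stand-in for an embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k corpus snippets most similar to the query."""
    q = bow(query)
    return sorted(CORPUS, key=lambda s: cosine(q, bow(s)), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Ground the model in retrieved text; instruct it to abstain otherwise."""
    context = "\n".join(f"- {s}" for s in retrieve(query))
    return (
        "Answer using ONLY the sources below; if they are insufficient, "
        "say you don't know.\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )

print(build_prompt("empiric antibiotics for community-acquired pneumonia"))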