KonstantinosBalaskas

326 posts

@KonBalaskas

Professor of Ophthalmology | Medical Artificial Intelligence and Digital Eye Health | UCL Institute of Ophthalmology

London, England · Joined August 2016
462 Following · 900 Followers
KonstantinosBalaskas @KonBalaskas
@rohanpaul_ai Spot-on—your MedAgentAudit breakdown shows how accuracy hides "broken teamwork" in med AI. HERMES cRCT echoes: DeepMind AI AUC 0.99 → 0.56 real-world, 2x false referrals vs. clinicians. Teleophthalmology cuts false-positives 59%. Evidence > illusions. Lancet: x.com/KonBalaskas/st… (DM for NIHR economic eval)
Rohan Paul @rohanpaul_ai
This is not great news for Medical AI. ☹️ It shows that medical AI agents built from multiple LLMs often look correct but think wrong. Even when they give the right diagnosis, most of them get there through broken teamwork. In over 68% of “successful” cases, the agents already agreed at the start, so the group talk did nothing useful.
The authors reviewed 3,600 medical cases and found recurring problems, like correct facts disappearing mid-discussion, smart minority opinions getting ignored, and agents choosing easy votes instead of reasoning. They built a tool called AuditTrail to track how facts and opinions move during a debate. It found that evidence often drops between early steps and the final answer, and that longer talks sometimes make things worse because the models start copying each other instead of thinking. Even worse, the systems often pick low-risk answers when a life-threatening one was on the table.
So the scary part is this: a “high-accuracy” medical AI can sound smart but still hide unsafe logic underneath. Overall, this paper exposes how accuracy numbers lie. An AI can pass medical tests yet still reason in unsafe, careless, or biased ways, which makes it untrustworthy for real patients.
---
Paper: arxiv.org/abs/2510.10185
Paper Title: "MedAgentAudit: Diagnosing and Quantifying Collaborative Failure Modes in Medical Multi-Agent Systems"
KonstantinosBalaskas @KonBalaskas
Pallavi Bagga Gunjan Naik @GongyuZhang Ana Paula Ribeiro Reis Estelle Ioannidou Rosana Lima Georgina Wignall 👏❤️