
We only have spotty information about this very important topic. It suggests AI can be good at diagnosis, but the real world doesn't always match the experiments. x.com/emollick/statu…
Ethan Mollick@emollick
Across most medical benchmarks, including when real cases & human doctors are involved, there is a clear trend of AI models improving over time (and on many, today's AI beats human doctors). But we do not yet have many studies measuring real-world performance of AI in medicine.

Most importantly, and probably the biggest bottleneck: they need to stop testing against benchmarks made up of cases that medical professionals have already curated and summarized. If people want to see how well these models perform, they need to let random patients input their own queries and let the models ask their own questions in response. The cognitive disorganization of real patients, without a doctor filtering it for the models, is probably a significant limitation in training.
