
Moritz
2K posts











Call me old-fashioned, but I don't want hyperloops, just trains that run on time; I don't want new air taxis, just reliable public transit; and I don't want new cable cars until every 🇩🇪 city has safe and reliable bike lanes. Why is that so hard for German transport politicians to understand?



Can we really trust AI in critical areas like medical image diagnosis? No, and they are even worse than random.
Our latest study, "Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA," uncovers the stark limitations of state-of-the-art models like GPT-4V and Gemini Pro in medical VQA tasks. 🏥📚
Key Findings:
🔍 A simple yet effective probing with adversarial pairs reveals critical gaps in model performance.
🏥 We present the Probing Evaluation for Medical Diagnosis (ProbMed) dataset to rigorously assess LMM performance in medical imaging through probing evaluation and procedural diagnosis.
🩺 Top-performing models like GPT-4V and Gemini Pro perform worse than random guessing on specialized diagnostic questions. Models like LLaVA-Med struggle even with more general questions.
Check out the full paper for all the details!
🔗 Website: jackie-2000.github.io/probmed.github…
📄 Paper: arxiv.org/abs/2405.20421
🤖 Data: huggingface.co/datasets/rippl…
Kudos to my proud first-year PhD student @qianqi_yan and all collaborators @XuehaiH @xiangyue96! 🧵(1/N)



Stoner party in Wasserburg am Inn ends with an emergency doctor call-out dlvr.it/T7RvbK




Google, FFS.







