Maya Varma

25 posts

@mayavarma23

@Stanford CS PhD student // @KnightHennessy, @NDSEG, & @QuadFellowship

Stanford, CA · Joined October 2020
67 Following · 124 Followers
Maya Varma retweeted
AK @_akhaliq:
SMMILE: An Expert-Driven Benchmark for Multimodal Medical In-Context Learning
Maya Varma retweeted
Michael Moor @Michael_D_Moor:
🚨 New preprint! 🚨 In-context learning (ICL) is the intriguing ability of LLMs to learn to solve tasks purely from context, without parameter updates. For multimodal LLMs (MLLMs), ICL is poorly understood, especially in the medical domain, where doctors often face only a few relevant prior cases, prior imaging studies, etc. But can MLLMs actually learn from context in the medical domain? 🤔 To answer this question, we introduce SMMILE, an expert-curated benchmark for multimodal ICL in the medical domain!

TL;DR: 💡 MLLMs perform surprisingly poorly at multimodal ICL in the medical domain! Adding in-context examples sometimes even hurts performance, and some models perform worse than random baselines. Models still have a way to go here! Many previous ICL evaluations leveraged random, potentially irrelevant examples; here, with the help of clinical experts, we carefully craft explicit task demonstrations to probe ICL ability more robustly.

📊 The benchmark:
-> 111 expert-curated problems (517 ICL problems)
-> 6 medical specialties, 13 imaging modalities
-> SMMILE++ variant with 1,038 permuted problems
-> Comprehensive evaluation of 15 state-of-the-art MLLMs

Check it out:
🔗 Page: smmile-benchmark.github.io
📜 Paper: arxiv.org/abs/2506.21355
💻 Code: github.com/eth-medical-ai…
📁 Dataset: huggingface.co/smmile

Great collab with stellar colleagues @mrieff_02* @mayavarma23* (co-first) @IAMJBDEL# (co-last) and more!
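A minimal sketch of how one might load the benchmark from the Hugging Face Hub and assemble an in-context prompt. The dataset id and field names below are assumptions (the tweet links only to the huggingface.co/smmile org page), so check the project page for the real schema:

```python
from datasets import load_dataset

# Hypothetical dataset id and schema; see huggingface.co/smmile for the real ones.
ds = load_dataset("smmile/SMMILE", split="test")

def build_icl_prompt(problem):
    """Concatenate the expert-curated demonstrations before the query question."""
    parts = []
    for demo in problem["demonstrations"]:              # assumed field name
        parts.append(f"<image> {demo['question']}\nAnswer: {demo['answer']}")
    parts.append(f"<image> {problem['question']}\nAnswer:")
    return "\n\n".join(parts)                           # images go to the MLLM separately

print(build_icl_prompt(ds[0]))
```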
Maya Varma retweeted
JB @IAMJBDEL:
💥 We unveil our paper accepted at the #ACL2025 Main Conference: Automated Structured Report Generation Let's revisit automated radiology report generation for CXR. Free-form reports make it hard for AI systems to learn accurate generation, and even harder to evaluate. 🧵👇 @StanfordAIMI @hopprai
Maya Varma retweeted
Magda Paschali @magdapasc:
🧵 What if AI could learn from millions of unlabeled radiology images and reports—and then flexibly adapt to new clinical tasks? In a new comprehensive review in @radiology_rsna, we dive into how foundation models (FMs) are set to revolutionize radiology! @AIMI_Stanford (1/6) 👇
Maya Varma retweeted
Akshay Chaudhari @Dr_ASChaudhari:
1/ Updates on our improved open-source CheXagent with new transparent benchmarks! We ran a new reader study mimicking real workflows: radiology residents drafted reports that attendings reviewed and edited. Results from 8 radiologists show major efficiency gains. Key findings: 👇
Maya Varma @mayavarma23:
(3/4) We evaluate RaVL by introducing a large-scale evaluation framework with 654 fine-tuned VLMs annotated with ground-truth spurious correlations. We also show that RaVL can surface novel spurious correlations in off-the-shelf VLMs, both in the general and medical domains!
Maya Varma retweeted
JB @IAMJBDEL:
So proud of the release of the GREEN metric. See what you can do when you combine medical AI research with open source: 26,000 downloads on 🤗 Hugging Face and counting.
🟥 EMNLP proceedings: aclanthology.org/2024.findings-…
🤗 Dataset: huggingface.co/datasets/Stanf…
🤗 Models: huggingface.co/collections/St…
🤗 Space: huggingface.co/spaces/Stanfor…
🐍 GitHub: github.com/Stanford-AIMI/…
📄 Project Page: stanford-aimi.github.io/green.html
🧵 👇
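GREEN is an LLM-based metric that reads a candidate radiology report alongside a reference and emits a clinically grounded error analysis. A minimal sketch of how one might call such a checkpoint; the model id and prompt format here are assumptions (the Hub links above are truncated), so the project page is the authoritative reference:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint id; see the StanfordAIMI collections on the Hub for real ones.
model_id = "StanfordAIMI/GREEN-model"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

reference = "No acute cardiopulmonary abnormality."
candidate = "Mild cardiomegaly. No focal consolidation."

# Assumed prompt format: the metric LLM compares the two reports in free text.
prompt = (
    "Compare the candidate radiology report to the reference and list any "
    f"clinically significant errors.\nReference: {reference}\nCandidate: {candidate}\n"
)
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))
```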
Maya Varma retweeted
Louis Blankemeier @loublanks:
🧙 Excited to introduce Merlin, a vision language foundation model for 3D computed tomography 🐈‍⬛🩻
Trained to understand 3D abdominal CT scans using supervision from:
💾 Structured electronic health records (1.8+ million codes)
🗒️ Natural language radiology reports (6+ million tokens)
Paper: arxiv.org/abs/2406.06512
🧵 1/10
Maya Varma retweeted
Curt Langlotz @curtlanglotz:
Five years ago, thanks to the leadership of @mattlungrenMD, @stanfordAIMI released the CheXpert images: 223K JPG CXRs with labels for 14 conditions. CheXpert has been cited >6000 times, mostly related to development of supervised learning methods. Much has changed since then.🧵
Maya Varma retweeted
Zhihong Chen @zhjohnchan:
⭐️ Excited to share our latest work about AI in healthcare. We present CheXagent, a foundation model for Chest X-ray interpretation.
📄 Paper: arxiv.org/abs/2401.12208
🌐 Website: stanford-aimi.github.io/chexagent.html
🧵 1/N
Quoting AK @_akhaliq:

CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation

Paper page: huggingface.co/papers/2401.12…

Chest X-rays (CXRs) are the most frequently performed imaging test in clinical practice. Recent advances in the development of vision-language foundation models (FMs) give rise to the possibility of performing automated CXR interpretation, which can assist physicians with clinical decision-making and improve patient outcomes. However, developing FMs that can accurately interpret CXRs is challenging due to the (1) limited availability of large-scale vision-language datasets in the medical image domain, (2) lack of vision and language encoders that can capture the complexities of medical data, and (3) absence of evaluation frameworks for benchmarking the abilities of FMs on CXR interpretation.

In this work, we address these challenges by first introducing CheXinstruct, a large-scale instruction-tuning dataset curated from 28 publicly-available datasets. We then present CheXagent, an instruction-tuned FM capable of analyzing and summarizing CXRs. To build CheXagent, we design a clinical large language model (LLM) for parsing radiology reports, a vision encoder for representing CXR images, and a network to bridge the vision and language modalities. Finally, we introduce CheXbench, a novel benchmark designed to systematically evaluate FMs across 8 clinically-relevant CXR interpretation tasks.

Extensive quantitative evaluations and qualitative reviews with five expert radiologists demonstrate that CheXagent outperforms previously-developed general- and medical-domain FMs on CheXbench tasks. Furthermore, in an effort to improve model transparency, we perform a fairness evaluation across factors of sex, race and age to highlight potential performance disparities.
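A minimal sketch of querying an instruction-tuned CXR model like CheXagent through the transformers API. The checkpoint id and processor interface are assumptions here, so treat the project website above as the authoritative usage reference:

```python
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

# Assumed checkpoint id; confirm on stanford-aimi.github.io/chexagent.html.
model_id = "StanfordAIMI/CheXagent-8b"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto", trust_remote_code=True
).eval()

image = Image.open("cxr.jpg")                     # a chest X-ray to interpret
prompt = "Describe the findings in this chest X-ray."

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=256)
print(processor.tokenizer.decode(out[0], skip_special_tokens=True))
```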

Maya Varma retweeted
Simran Arora @simran_s_arora:
Ran out of OpenAI credits? 💰 We present a prompting strategy that enables the open-source and off-the-shelf GPT-J-6B to outperform *few shot* GPT-3 175B on 15 popular language benchmarks! 🚀
Paper and code:
📜 arxiv.org/abs/2210.02441
💻 github.com/HazyResearch/a…
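The Ask Me Anything strategy behind this result reformats one task into several question-answering prompts and aggregates their answers; the paper aggregates with weak supervision, so the plain majority vote below is a simplification. A sketch using the open-source GPT-J-6B named in the tweet:

```python
from collections import Counter
from transformers import pipeline

# Any open-source causal LM works for this sketch; GPT-J-6B is the one in the tweet.
generate = pipeline("text-generation", model="EleutherAI/gpt-j-6B")

def ask(prompt: str) -> str:
    """Query the LM and keep only the first word of its continuation."""
    text = generate(prompt, max_new_tokens=3, return_full_text=False)[0]["generated_text"]
    return text.strip().split()[0].lower().strip(".,") if text.strip() else ""

review = "The plot was thin, but the performances were wonderful."

# Step 1: reformat the task (here, sentiment) into several QA-style prompts.
prompts = [
    f"Review: {review}\nQuestion: Is the review positive? Answer yes or no:",
    f"{review}\nQuestion: Did the writer like the movie? Answer yes or no:",
    f"Question: Is the following review favorable? Answer yes or no: {review}\nAnswer:",
]

# Step 2: aggregate the prompts' answers (majority vote stands in for weak supervision).
votes = Counter(ask(p) for p in prompts)
print(votes.most_common(1)[0][0])
```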
Maya Varma @mayavarma23:
Excited to share Domino, which is appearing as an oral at #ICLR22! We use cross-modal embeddings to identify & describe systematic errors in ML models. 📜: arxiv.org/abs/2203.14960 Joint work w/ @EyubogluSabri @KhaledSaab11 @jdunnmon @james_y_zou @HazyResearch & others!
Quoting Sabri Eyuboglu @EyubogluSabri:

Do you ever wonder if your model - despite logging impressive accuracy - is still failing on an important but unknown slice of your dataset? We certainly do! Stoked to share recent work @iclr22 in which we develop & evaluate ~slice discovery methods~ (1/7) ai.stanford.edu/blog/domino/
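Domino fits an error-aware mixture model over cross-modal embeddings so that coherent, underperforming slices fall out as clusters. A simplified stand-in for that idea, using a plain Gaussian mixture over embeddings concatenated with the model's errors (the real method, and its natural-language slice descriptions, are in the paper and released code):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def find_error_slices(emb, y_true, y_pred, n_slices=10):
    """emb: (n, d) cross-modal (e.g., CLIP) embeddings of validation inputs."""
    errors = (y_true != y_pred).astype(float).reshape(-1, 1)
    # Weight the error column so mixture components separate failures from successes.
    features = np.hstack([emb, 10.0 * errors])
    gmm = GaussianMixture(n_components=n_slices, covariance_type="diag").fit(features)
    slices = gmm.predict(features)
    # Rank slices by error rate to surface systematic failure modes.
    rates = [(s, errors[slices == s].mean()) for s in range(n_slices)]
    return sorted(rates, key=lambda t: -t[1]), slices

# Toy usage with random data, just to show the shapes involved.
emb = np.random.randn(500, 32)
y_true = np.random.randint(0, 2, 500)
y_pred = np.random.randint(0, 2, 500)
ranked, _ = find_error_slices(emb, y_true, y_pred, n_slices=5)
print(ranked[:3])   # the slices with the highest error rates
```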

Maya Varma retweeted
Mayee Chen @MayeeChen:
New preprint alert! 📣 How do we produce transferable and robust representations with supervised contrastive learning? We need *geometric spread* and an inductive bias towards *latent subclass clustering* in representation space. 📜 arxiv.org/abs/2204.07596 👇 (1/n)
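For reference, the supervised contrastive loss the thread analyzes pulls same-class embeddings together and pushes other classes apart on the unit sphere. A minimal PyTorch sketch of the standard SupCon form, not the paper's exact code:

```python
import torch
import torch.nn.functional as F

def supcon_loss(z: torch.Tensor, labels: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """z: (n, d) embeddings; labels: (n,) class ids; tau: temperature."""
    z = F.normalize(z, dim=1)                       # embeddings on the hypersphere
    sim = z @ z.T / tau                             # pairwise similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = ((labels[:, None] == labels[None, :]) & ~self_mask).float()
    # Log-softmax over all other samples, then average over each anchor's positives.
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(self_mask, float("-inf")), dim=1, keepdim=True
    )
    loss = -(log_prob * pos_mask).sum(1) / pos_mask.sum(1).clamp(min=1)
    return loss.mean()

loss = supcon_loss(torch.randn(64, 128), torch.randint(0, 10, (64,)))
```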
Maya Varma retweeted
Megan Leszczynski @m_leszczy:
New preprint alert! 📣 How do we improve long-tailed performance of entity retrieval? We use a supervised contrastive loss to *geometrically encode entity types* in representation space w/ bi-encoders. Check out our paper on TABi! 📜 arxiv.org/abs/2204.08173 Details👇 (1/n)
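At retrieval time, a bi-encoder like TABi's scores a query against every entity with a dot product over independently precomputed embeddings; the type-aware contrastive training is the paper's contribution and is not shown in this sketch:

```python
import torch
import torch.nn.functional as F

# Stand-in embeddings: in TABi these come from the trained bi-encoder.
# Entities are encoded once offline; only the query is encoded at query time.
entity_embs = F.normalize(torch.randn(10_000, 256), dim=1)
query_emb = F.normalize(torch.randn(256), dim=0)

scores = entity_embs @ query_emb        # one dot product per candidate entity
top = torch.topk(scores, k=5)           # nearest entities = retrieval result
print(top.indices.tolist())
```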
Maya Varma @mayavarma23:
(6/6) A model pretrained on this dataset and injected with structural knowledge improves disambiguation of rare entities by 2.5 to 57 accuracy points across two benchmark datasets!
Maya Varma @mayavarma23:
(5/6) We utilize our integration scheme to augment structural resources and generate a large pretraining biomedical NED dataset (available at huggingface.co/datasets/mvarm…).
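A minimal sketch of pulling such a dataset from the Hub with the datasets library. The dataset id below is a placeholder, since the link in the tweet is truncated:

```python
from datasets import load_dataset

# Placeholder id: substitute the real path under huggingface.co/datasets/mvarm… here.
ned_data = load_dataset("mvarma/biomedical-ned")    # hypothetical id
print(ned_data)                                     # splits and sizes
print(ned_data["train"][0])                         # inspect one pretraining example
```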