Alejandro Lozano
@Ale9806_

48 posts

Ph.D. Student @ Stanford AI Lab · Building open biomedical AI

Stanford, California · Joined May 2023
58 Following · 171 Followers
Alejandro Lozano reposted
Yuhui Zhang
Yuhui Zhang@Zhang_Yu_hui·
🧬 What if we could build a virtual cell to predict how it responds to drugs or genetic perturbations? Super excited to introduce CellFlux at #ICML2025 — an image generative model that simulates cellular morphological changes from microscopy images. yuhui-zh15.github.io/CellFlux/

💡 Key Innovation: We frame perturbation prediction as a distribution-to-distribution learning problem, mapping control cells to perturbed cells within the same batch to mitigate biological batch artifacts, and solve it using flow matching.

📊 Results:
✅ 35% higher image fidelity
✅ 12% greater biological accuracy
✅ New capabilities: batch effect correction & trajectory modeling

🎯 Impact: Paving the way for faster drug discovery and deeper biological insights.

📍 Poster: Tue 7/15, 11:00 AM – 1:30 PM, #301

🙏 Huge thanks to all my brilliant collaborators: @hhhhh2033528 (co-lead; now Harvard AIM PhD! 🎉) @ChenyuW64562111 @TianhongLi6 @zoewefers @jnirsch @jmhb0 @DaisyYDing @Ale9806_ @Prof_Lundberg @yeung_levy!

#MachineLearning #ComputerVision #ComputationalBiology #DrugDiscovery #AIforScience #CVforScience
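The distribution-to-distribution framing above is typically trained with a flow-matching objective: sample a control image and a perturbed image, interpolate between them, and regress a model's velocity field toward their difference. A minimal NumPy sketch of that objective (generic rectified-flow-style flow matching, not the actual CellFlux code; all names are hypothetical):

```python
import numpy as np

def flow_matching_targets(x0, x1, t):
    """Linear-interpolation flow matching targets.

    x0: batch of control-cell images, shape (B, D)
    x1: batch of perturbed-cell images from the SAME experimental batch,
        so batch artifacts are shared between source and target.
    t:  per-sample times in [0, 1], shape (B, 1)
    Returns the interpolated point x_t and the velocity regression target.
    """
    x_t = (1.0 - t) * x0 + t * x1   # point on the straight path
    v_target = x1 - x0              # constant velocity along that path
    return x_t, v_target

def fm_loss(model_velocity, v_target):
    """Mean-squared flow-matching loss."""
    return float(np.mean((model_velocity - v_target) ** 2))

# Toy usage: a "perfect" model that already outputs x1 - x0 has zero loss.
rng = np.random.default_rng(0)
x0 = rng.normal(size=(4, 8))   # stand-in control cells
x1 = rng.normal(size=(4, 8))   # stand-in perturbed cells
t = rng.uniform(size=(4, 1))
x_t, v = flow_matching_targets(x0, x1, t)
assert fm_loss(v, v) == 0.0
```

At inference, sampling integrates the learned velocity field from control images toward the perturbed distribution; the straight-path interpolation above is the simplest choice of probability path.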
3 replies · 16 reposts · 51 likes · 8.5K views
James Burgess
James Burgess@jmhb0·
I'm at CVPR! Come see me at one of my posters, or reach out for a chat.

MicroVQA: a reasoning LLM benchmark in biology. Sat 5-7pm, hall D, poster #357. jmhb0.github.io/microvqa/

BIOMEDICA: a massive vision-language dataset. Sat 5-7pm, hall D, poster #374. minwoosun.github.io/biomedica-webs…
James Burgess@jmhb0

Introducing MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research #CVPR2025
✅ 1k multimodal reasoning VQAs testing MLLMs for science
🧑‍🔬 Biology researchers manually created the questions
🤖 RefineBot: a method for fixing QA language shortcuts 🧵

3 replies · 0 reposts · 25 likes · 915 views
Alejandro Lozano reposted
Josiah Aklilu
Josiah Aklilu@AkliluJosiah2·
There’s growing excitement around VLMs and their potential to transform surgery🏥—but where exactly are we on the path to AI-assisted surgical procedures? In our latest work, we systematically evaluated leading VLMs across major surgical tasks where AI is gaining traction. 🧵
2 replies · 6 reposts · 30 likes · 9.2K views
Alejandro Lozano
Alejandro Lozano@Ale9806_·
@xhluca Not only useful but necessary! Thanks for making such an amazing piece of code.
1 reply · 0 reposts · 2 likes · 75 views
Xing Han Lu
Xing Han Lu@xhluca·
@Ale9806_ Congratulations! Glad to know BM25S was useful for designing the index :)
1 reply · 0 reposts · 2 likes · 104 views
Alejandro Lozano
Alejandro Lozano@Ale9806_·
Earlier this year, we released the BIOMEDICA dataset, featuring 24 million unique image-caption pairs and 30 million image references derived from open-source biomedical literature. It's been great to see the community engaging with it — we're currently seeing around 6K downloads per month.

I'm excited to share our latest update, showcasing the new additions to the BIOMEDICA framework:
📄 Brief Announcement: lnkd.in/gVjgT8me
🗄️ BIOMEDICA Index: lnkd.in/gxD4YwDa
🌐 Website: lnkd.in/gGrbZaRu

Contributions:
🗄️ The BIOMEDICA Index: a hybrid vector-BM25 database capable of querying similar images, captions, and full-text articles with language and image queries!
🎯 BMC-SmolVLM: a small yet powerful biomedical VLM. With only 2.2B parameters, BMC-SmolVLM achieves performance comparable to biomedical VLMs with 7–13B parameters.
🤖 BMC-Agent: by leveraging the BIOMEDICA Index, BMC-Agent can retrieve and analyze relevant articles given a biomedical query. We demonstrate that this plug-and-play approach enables LLMs to answer questions based on newly modified medical knowledge, even when the base models initially get the answer wrong.

Lastly, as part of our release, we're introducing the BIOMEDICA Subsets: a collection of pre-filtered datasets created in response to community requests.
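A hybrid vector-BM25 index of the kind described above typically scores each document with both a sparse lexical signal and a dense embedding similarity, then fuses the two rankings. A toy sketch of such score fusion (a generic illustration, not the BIOMEDICA Index implementation; the min-max normalization and alpha weighting are assumptions):

```python
import math
import numpy as np

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Plain Okapi BM25 over tokenized docs (each doc a list of tokens)."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    scores = np.zeros(N)
    for term in query_terms:
        df = sum(term in d for d in docs)       # document frequency
        if df == 0:
            continue
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1.0)
        for i, d in enumerate(docs):
            tf = d.count(term)
            denom = tf + k1 * (1 - b + b * len(d) / avgdl)
            scores[i] += idf * tf * (k1 + 1) / denom
    return scores

def hybrid_scores(sparse, dense, alpha=0.5):
    """Min-max normalize each signal, then blend; alpha weights the dense side."""
    def norm(x):
        span = x.max() - x.min()
        return (x - x.min()) / span if span > 0 else np.zeros_like(x)
    return (1 - alpha) * norm(sparse) + alpha * norm(dense)

docs = [["cell", "morphology", "imaging"],
        ["gene", "expression", "atlas"],
        ["cell", "imaging", "microscopy", "imaging"]]
sparse = bm25_scores(["cell", "imaging"], docs)
dense = np.array([0.2, 0.1, 0.9])   # stand-in embedding similarities
ranked = np.argsort(-hybrid_scores(sparse, dense))
assert ranked[0] == 2               # doc 2 wins on both signals
```

In practice the sparse side would come from a library such as BM25S (mentioned upthread) and the dense side from an image or text encoder; the fusion step is where the two retrieval modes meet.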
3 replies · 9 reposts · 27 likes · 3.6K views
Alejandro Lozano
Alejandro Lozano@Ale9806_·
@Himanshu_nitrr Valid question! Given the broad space of biomedical research (EHR-derived datasets, scientific-derived datasets, human-aggregated metadata), we specifically compare our work to scalable methods derived from scientific literature!
0 replies · 0 reposts · 0 likes · 22 views
Alejandro Lozano
Alejandro Lozano@Ale9806_·
Biomedical datasets are often confined to specific domains, missing valuable insights from adjacent fields. To bridge this gap, we present BIOMEDICA: an open-source framework to extract and serialize PMC-OA. 📄Paper: lnkd.in/dUUgA6rR 🌐Website: lnkd.in/dnqZZW4M
13 replies · 51 reposts · 143 likes · 51.7K views
Myeong 명 👩‍💻
Myeong 명 👩‍💻@myeong_official·
@Ale9806_ This is a great step forward! How does the integration of different metadata types (captions, figure references, etc.) enhance the overall usefulness of the dataset for training models?
1 reply · 0 reposts · 0 likes · 23 views
Alejandro Lozano
Alejandro Lozano@Ale9806_·
Introducing video differencing, a new task for detecting differences between two videos of the same action. Notably, even the most advanced Video LLMs struggle with this challenge, underscoring the long road ahead!
James Burgess@jmhb0

🚨Large video-language models LLaVA-Video can do single-video tasks. But can they compare videos? Imagine you’re learning a sports skill like kicking: can an AI tell how your kick differs from an expert video? 🚀 Introducing "Video Action Differencing" (VidDiff), ICLR 2025 🧵

0 replies · 0 reposts · 4 likes · 261 views
Alejandro Lozano reposted
Kevin Wu
Kevin Wu@kevinywu·
Which LLMs work best for medical queries? 🩺✨

Introducing MedArena 🏥 — the first chatbot arena just for clinicians ⚕️!

👩‍⚕️👨‍⚕️ Licensed healthcare professionals can submit medical queries and compare + rank responses from two randomly selected LLMs (such as o1, Gemini 🌟, and Perplexity 🤖) 🏆.

💡 Your participation will be a valuable contribution to an open-source dataset of real-world queries and preferences for medicine 📊🔍.

🔗 Visit medarena.ai to try it out! 🚀

@james_y_zou @ericwu93 #MedTwitter 💬 #MedTech 🧬 #DigitalHealth 🌐 #MedicalAI 🤝
1 reply · 8 reposts · 17 likes · 3.5K views
Alejandro Lozano
Alejandro Lozano@Ale9806_·
Check out our new work accepted to ICLR 2025. We introduce time-to-event (TTE) pretraining to leverage temporal supervision from longitudinal EHR data and estimate the risk of future events. By scaling to 225M clinical events, we achieve SOTA prognostic performance!
Frazier Huo@Zepeng_Huo

🎉 Excited to share that our latest research, Time-to-Event Pretraining for 3D Medical Imaging, has been accepted at ICLR 2025! 🚀

🔍 Improving Medical Image Pretraining with Time-to-Event (TTE) Modeling
While self-supervised methods in medical imaging have significantly enhanced diagnostic accuracy and image segmentation performance, they struggle with prognosis — the prediction of future health outcomes. Reliable prognosis is essential for assessing disease progression and guiding clinical decision-making. Our framework addresses this gap by combining time-to-event (TTE) modeling with longitudinal EHRs to pretrain a 3D image encoder for outcome prediction, significantly improving prognostic prediction using only imaging data.

🔥 Key Highlights
• Time-to-event (TTE) pretraining: predicts time until critical clinical events by learning 3D imaging biomarkers.
• Massive scale: trained on 8,192 TTE tasks and 18,945 CT scans linked to EHR data containing 225M clinical events — the largest paired EHR+3D imaging research dataset currently available.
• Superior prognostic performance: 23.7% AUROC boost, 29.4% Harrell’s C-index improvement, and 54% better calibration without sacrificing diagnostic accuracy.

🛠️ How It Works
• Transform patient EHR timelines into large-scale TTE pretraining tasks, predicting time distributions until key medical events.
• A 3D vision encoder processes CT scans, generating embeddings for a task-specific TTE head (e.g., Cox models, survival networks).
• Enables AI to capture long-term health trajectories for better risk stratification and survival analysis.

📖 Read more
📄 Paper: lnkd.in/gfnGrNgS
🗂️ Dataset: EHR lnkd.in/gAxwvXdw · Imaging lnkd.in/gWKPrPir
💻 Code: Coming Soon!
🤗 Hugging Face Models: Coming Soon!

🙌 Huge thanks to my amazing co-leads @jasonafries, @Ale9806_, and co-authors @jeyamariajose, Ethan Steinberg, @loublanks, @Dr_ASChaudhari, @curtlanglotz, @drnigam. Excited to push medical imaging AI forward — stay tuned for tutorials & deep dives! 🏥🔬

#ICLR2025 #AI #MedicalImaging #TTE #SurvivalAnalysis #MultiModalAI
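The TTE head described above can be illustrated with a generic discrete-time survival objective: the image embedding produces a per-interval hazard, and the loss handles censored follow-up. A NumPy sketch (a standard discrete-time survival likelihood, not the paper's code; the interval binning and all names are assumptions):

```python
import numpy as np

def discrete_tte_nll(hazard_logits, event_bin, observed):
    """Negative log-likelihood of a discrete-time survival (TTE) head.

    hazard_logits: (B, K) per-interval hazard logits, e.g. produced from
                   a 3D image-encoder embedding by a linear layer.
    event_bin:     (B,) index of the interval where the event occurred,
                   or where follow-up was censored.
    observed:      (B,) 1 if the event was observed, 0 if censored.
    """
    h = 1.0 / (1.0 + np.exp(-hazard_logits))   # per-interval hazards
    B, _ = h.shape
    nll = 0.0
    for i in range(B):
        k = event_bin[i]
        # The patient survived every interval before k...
        nll -= np.sum(np.log(1.0 - h[i, :k] + 1e-12))
        if observed[i]:
            nll -= np.log(h[i, k] + 1e-12)       # ...then the event fired in k
        else:
            nll -= np.log(1.0 - h[i, k] + 1e-12) # ...and was still event-free at k
    return nll / B

# Toy usage: confident, correct hazards score better than uniform ones.
good = np.array([[-9.0, 9.0, -9.0]])   # predicts the event in interval 1
flat = np.zeros((1, 3))
y, obs = np.array([1]), np.array([1])
assert discrete_tte_nll(good, y, obs) < discrete_tte_nll(flat, y, obs)
```

Censoring handling is what lets EHR timelines supervise the encoder at scale: patients who leave the record still contribute "survived up to interval k" evidence.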

0 replies · 1 repost · 3 likes · 446 views
Alejandro Lozano reposted
Xiaohan Wang
Xiaohan Wang@XiaohanWang96·
🚀 Introducing Temporal Preference Optimization (TPO) – a video-centric post-training framework that enhances temporal grounding in long-form videos for Video-LMMs! 🎥✨

🔍 Key Highlights:
✅ Self-improvement via preference learning – models learn to differentiate well-grounded from inaccurate responses without manual annotations.
✅ Multi-level temporal grounding – effectively captures both localized segments and comprehensive video sequences.
✅ Efficient and scalable – utilizes only 10k video QA pairs for post-training and can scale seamlessly to larger datasets.
✅ Proven performance – TPO enhances two state-of-the-art Video-LMMs on LongVideoBench, MLVU, and Video-MME.

🔥 LLaVA-Video-TPO is now the top-performing 7B model on Video-MME, highlighting TPO's potential in advancing temporal reasoning.

This work was co-led with the talented undergraduate @rui__li, alongside fantastic collaborators @Zhang_Yu_hui and Zeyu Wang, and advised by @yeung_levy!
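Preference learning of this kind is commonly implemented with a DPO-style objective over (well-grounded, inaccurate) response pairs. A minimal sketch under that assumption (a generic DPO loss, not the TPO training code; all names are hypothetical):

```python
import numpy as np

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO-style preference loss on one (preferred, dispreferred) pair.

    logp_w / logp_l:         policy log-probs of the well-grounded (w)
                             and poorly-grounded (l) responses.
    ref_logp_w / ref_logp_l: the same quantities under a frozen reference
                             model, which anchors the update.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -np.log(1.0 / (1.0 + np.exp(-margin)))   # -log sigmoid(margin)

# The loss drops as the policy favors the preferred response more strongly
# than the reference model does.
base = dpo_loss(-10.0, -10.0, -10.0, -10.0)   # no preference learned yet
better = dpo_loss(-8.0, -12.0, -10.0, -10.0)  # preferred response boosted
assert better < base
```

The self-improvement angle in the tweet comes from how the pairs are built: contrasting responses conditioned on full versus degraded temporal context can yield preference pairs without manual labels.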
1 reply · 10 reposts · 28 likes · 3.1K views
Alejandro Lozano
Alejandro Lozano@Ale9806_·
[9/10] Shout out to my stellar first co-authors @minwsun and @jmhb0 for leading this effort, as well as the incredible team of computer scientists, statisticians, biologists, and clinicians that made this possible:
0 replies · 0 reposts · 2 likes · 324 views
Alejandro Lozano
Alejandro Lozano@Ale9806_·
[8/10] While our models offer state-of-the-art performance, all evaluations indicate that there is still significant room for improvement. We release all our contributions under a permissive license to facilitate broader use and further development.
0 replies · 0 reposts · 2 likes · 260 views
Alejandro Lozano
Alejandro Lozano@Ale9806_·
[7/10] 💡 We demonstrate the utility and accessibility of our resource by training BMC-CLIP, a suite of CLIP-style models continuously pre-trained on our dataset using different training recipes via streaming.
0 replies · 0 reposts · 2 likes · 336 views
Alejandro Lozano
Alejandro Lozano@Ale9806_·
[6/10] 🎯 We demonstrate the utility and accessibility of our resource by training BMC-CLIP, a suite of CLIP-style models continuously pre-trained on our dataset using different training recipes via streaming.
0 replies · 0 reposts · 2 likes · 232 views
Alejandro Lozano
Alejandro Lozano@Ale9806_·
[5/10] 💵 Our archive is hosted on Hugging Face, enabling streaming and eliminating the need to download 3.9 TB of data locally in order to use BIOMEDICA.
0 replies · 0 reposts · 2 likes · 226 views