Rahul Dhodapkar

109 posts

@rahuldhodapkar

physician-scientist, resident, computational biology, immunology, neuroscience, ophthalmology | MD @YaleMed | ex-software engineer @MongoDB

Joined July 2014

134 Following · 331 Followers

Rahul Dhodapkar retweeted

David van Dijk@david_van_dijk·Nov 4

C2S is now open for everyone. The biological LLM that learns the language of cells. Free for academic and commercial use. c2s.bio Join the growing community building with C2S. 🌱

Rahul Dhodapkar retweeted

David van Dijk@david_van_dijk·Oct 15

Exciting to see our collaboration with @Google highlighted here — using AI to generate and test new biological hypotheses!

Sundar Pichai@sundarpichai

An exciting milestone for AI in science: Our C2S-Scale 27B foundation model, built with @Yale and based on Gemma, generated a novel hypothesis about cancer cellular behavior, which scientists experimentally validated in living cells. With more preclinical and clinical tests, this discovery may reveal a promising new pathway for developing therapies to fight cancer.

Rahul Dhodapkar@rahuldhodapkar·Oct 16

So proud to be a part of this groundbreaking effort - just the beginning of many discoveries, and new ways to improve health for us all

Sundar Pichai@sundarpichai

Rahul Dhodapkar retweeted

Sundar Pichai@sundarpichai·Oct 15

Rahul Dhodapkar retweeted

David van Dijk@david_van_dijk·Sep 3

🚀 Beyond excited to announce our release of the #Cell2Sentence (C2S) API and new foundation models! 🎉 Our C2S API makes it incredibly easy to convert #singlecell data into cell sentences, perform inference with LLM-based C2S models, fine-tune them, and convert cell sentences back into expression data—all in one seamless workflow. 🧬 We're releasing powerful new 410M parameter models designed for diverse tasks, including cell type prediction, cell generation, cell annotation, and cell embedding! 🌟 But there’s more: We provide the first foundation model that can encode multiple cells in context, opening up completely new possibilities in single-cell analysis! 🦄 Check out our tutorials to get started, explore the models on Hugging Face, and read the manuscript for more details. We can’t wait to see the innovative applications the community will dream up with these new tools. Stay tuned—more updates are on the way! 🔗 github.com/vandijklab/cel… 📝 biorxiv.org/content/10.110… 🤗 huggingface.co/vandijklab
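
To make the "cell sentence" idea concrete, here is a minimal sketch in plain Python. It assumes a cell sentence is the list of gene names ordered from highest to lowest expression, with unexpressed genes dropped; the post does not spell this out. The function `cell_to_sentence`, its `top_k` parameter, and the toy counts are hypothetical and are not part of the C2S API.

```python
def cell_to_sentence(expression, gene_names, top_k=None):
    """Return gene names ordered by descending expression, space-separated.

    Hypothetical helper for illustration only; not the C2S API.
    """
    if len(expression) != len(gene_names):
        raise ValueError("expression and gene_names must have the same length")
    # sorted() is stable, so genes with equal expression keep their input order.
    order = sorted(range(len(expression)), key=lambda i: -expression[i])
    # Assumption: genes with zero counts are left out of the sentence.
    ranked = [gene_names[i] for i in order if expression[i] > 0]
    if top_k is not None:
        ranked = ranked[:top_k]
    return " ".join(ranked)


# Toy example: one cell, four genes, made-up counts.
genes = ["CD3E", "MS4A1", "LYZ", "NKG7"]
counts = [5, 0, 12, 1]
sentence = cell_to_sentence(counts, genes)
print(sentence)  # LYZ CD3E NKG7
```

Going back from a sentence to expression values needs a model of how rank relates to expression level, which this sketch does not attempt.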

Rahul Dhodapkar@rahuldhodapkar·Mar 9

Excited to share this work - a new way to apply foundation models to graph structured data. Please reach out if interested in bringing any of these techniques to your data or use case!

David van Dijk@david_van_dijk

💡 Want to leverage the power of foundation models in graphs? 🔥 Introducing Foundation-Informed Message-Passing (FIMP), a framework for applying any pre-trained transformer-based foundation model to Graph Neural Networks! arxiv.org/abs/2210.09475

Rahul Dhodapkar@rahuldhodapkar·Mar 3

Proud to be a part of this fantastic effort!

Prof. Akiko Iwasaki@VirusesImmunity

Delighted to share our latest work on #longCOVID - sex differences in symptoms and immune signatures. Led by @SilvaJ_C @taka_takehiro @wood_jamie_1 et al. with @LeyingGuan & @PutrinoLab. We find a striking inverse correlation btw testosterone levels and symptom burden👇🏼 (1/) medrxiv.org/content/10.110…

Rahul Dhodapkar retweeted

Prof. Akiko Iwasaki@VirusesImmunity·Mar 3

Rahul Dhodapkar@rahuldhodapkar·Feb 19

@ylecun @yaroslavvb The problem with this assertion is that there are many other places where information can be encoded in the zygote beyond germline sequence - e.g. the physical orientation of DNA within the nucleus, subcellular sequestration of premade proteins etc. These are >>8MB

Yann LeCun@ylecun·Feb 18

Whatever pre-pre-training evolution has performed to make humans use language, it has to squeeze into less than 8MB of genomic information. 8MB is an upper bound on the difference in information content between chimps and humans (1% of a genome with 3 billion pairs). That's really not very much.
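
The 8MB figure can be checked with a few lines of arithmetic. This sketch assumes 2 bits per base pair (four possible bases) and decimal megabytes; neither assumption is stated in the post, and the variable names are illustrative only.

```python
BASE_PAIRS = 3_000_000_000  # genome size quoted in the post
BITS_PER_BASE = 2           # assumption: 4 possible bases, log2(4) = 2 bits
DIFF_FRACTION = 0.01        # 1% human-chimp difference quoted in the post

genome_bytes = BASE_PAIRS * BITS_PER_BASE / 8
diff_megabytes = genome_bytes * DIFF_FRACTION / 1_000_000

print(f"whole genome:  {genome_bytes / 1_000_000:.0f} MB")
print(f"1% difference: {diff_megabytes:.1f} MB")
```

Under these assumptions the whole genome comes to 750 MB and the 1% difference to 7.5 MB, which rounds to the quoted 8MB. The estimate covers sequence only.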

Yaroslav Bulatov@yaroslavvb·Feb 18

Humans learn faster than machines, but that's just the "fine-tuning" part of the human. Pre-training part is a billion years of evolutionary feedback data. So the question is how to transfer some of that "pretrained" knowledge.

Rahul Dhodapkar@rahuldhodapkar·Feb 16

It's been over a year now since I first proposed cell2sentence (biorxiv.org/content/10.110…) - a universal framework that allows *any LLM* to interface with single cell data. Now, together with @david_van_dijk and some incredibly talented students, I'm excited to share major progress

David van Dijk@david_van_dijk

Major Cell2Sentence update 🎉🔬! We’ve been thrilled to see the attention Cell2Sentence has received from the single-cell community. Now, we’re excited to release our first update of Cell2Sentence (C2S) - a framework to leverage LLMs to train foundational single-cell models, directly in text.

What’s new & out:

Updated preprint with latest results biorxiv.org/content/10.110…
First full cell model available on the HuggingFace hub huggingface.co/vandijklab/pyt…
Updated codebase for data transformation & training github.com/vandijklab/cel…

We now fine-tune language models to generate entire cells, predict combinatorial cell labels, and generate textual data insights directly from cell sentences. We train GPT-2 and Pythia models on a large multi-tissue dataset containing 36M cells from @cellxgene as well as an immune tissue dataset containing 270k cells.

C2S LMs achieve SOTA performance in single-cell data generation. C2S models trained for combinatorial label prediction settings excel in low-data regimes, outperforming single-cell foundation model baselines. We also show that C2S models benefit from natural language pre-training and always outperform models trained from scratch on cell sentences.

C2S provides a straightforward approach to adapting LLMs for single-cell data analysis, leveraging their natural language capabilities to generate and derive insights from single cells. We are convinced that C2S’ approach of integrating data modalities through text is the way forward for single-cell foundation models, from representing multi-omics data to generating clinical insights, all in a human readable format.

We’re excited to start building a community around Cell2Sentence! If you also think that C2S will be the framework for single-cell foundation models, and are interested in contributing, reach out to us! We welcome any collaborations and discussions.

Huge thanks to our collaborator @aminkarbasi and the C2S team (@danielflevine, @sachalevy3, @SyedARizvi5688, @nazreenpm, Xingyu Chen, @dzhang03, @GhadermarziSina, Ruiming Wu, Ivan Vrkic, Anna Zhong, Daphne Raskin, Insu Han, @aho_fonseca, @josueortc) for their hard work on C2S! Special thanks to @rahuldhodapkar, who co-supervises this project.

Rahul Dhodapkar@rahuldhodapkar·Nov 8

Some very cool insights here into the intersection between human labeling and other distance-based "unsupervised" approaches to classification! Exciting work!

Maria Brbic@mariabrbic

How to infer human labelling of a given dataset in a model-agnostic way? Check our new method HUME accepted at @NeurIPSConf as #spotlight!🌟 HUME provides a new view to tackle unsupervised learning. Kudos to my fantastic PhD student @artygadetsky! Paper arxiv.org/abs/2311.02940

Rahul Dhodapkar retweeted

Nature Methods@naturemethods·Nov 2

CINEMA-OT, developed by @david_van_dijk, @mingze7316 and colleagues is a causal-inference based method for analyzing the effects of single cell perturbation experiments. @ishizukalab, @ellenfoxman, @rahuldhodapkar, @aho_fonseca nature.com/articles/s4159…

Rahul Dhodapkar@rahuldhodapkar·Nov 2

Happy to share this collaboration with @david_van_dijk @ishizukalab @EllenFoxman - a new causal method to infer perturbational effects with single cell resolution! Amazing work by @Mingze7316

David van Dijk@david_van_dijk

Thrilled to announce that CINEMA-OT is now published at Nature Methods! nature.com/articles/s4159…

Rahul Dhodapkar@rahuldhodapkar·Sep 27

Extremely excited to share our work on #LongCovid, now out in #Nature! I'm honored to be part of an amazing team contributing to our knowledge of a disease affecting so many lives worldwide. Very clear that this disease has *objectively measurable* immune characteristics.

Prof. Akiko Iwasaki@VirusesImmunity

So pleased to report that our Mount Sinai-Yale long COVID (MY-LC) paper with @putrinolab & others is now published!! Proud of the hard work of all who contributed. We found biological signatures that can distinguish people with vs. without #longCOVID (1/) nature.com/articles/s4158…

Rahul Dhodapkar@rahuldhodapkar·Sep 15

Very proud to share this collaboration with @david_van_dijk and team, where we show a new fundamental approach that allows language-pretrained LLMs to be used *without architectural modifications* to learn from #singlecell data. Please check it out!

David van Dijk@david_van_dijk

Single Cells as text? We developed Cell2Sentence, a method that allows training of Large Language Models on single-cell data! biorxiv.org/content/10.110… With @danielflevine @SyedARizvi5688 @sachalevy3 @rahuldhodapkar @YaleSEAS @YaleMed #AI #ML #NLP #genomics #CompBio #singlecell

Rahul Dhodapkar retweeted

David van Dijk@david_van_dijk·Sep 14

Introducing BrainLM 🧠🤖the first foundation model for #fMRI analysis trained on 6,700 hours of brain activity data! Fine-tune for specialized tasks or leverage zero-shot inference capabilities! @WuTsaiYale @YaleCompsci @YaleCBB @YaleMed biorxiv.org/content/10.110…

Rahul Dhodapkar retweeted

Madhav Dhodapkar@MadhavDhodapkar·Jul 6

Then and Now. Harnessing Immunity in Myeloma: Wine That Keeps Getting Better? ashpublications.org/thehematologis…

Rahul Dhodapkar retweeted

Prof. Akiko Iwasaki@VirusesImmunity·May 5

A new study in @SciImmunology led by @AnisBarmada & Jon Klein @YaleIBIO with @lucasite_lab @InciYildirim11 @YalePediatrics teams explored immune signatures of people who developed myocarditis after mRNA vaccines. Here is what we found. 🧵 (1/) science.org/doi/10.1126/sc…

Rahul Dhodapkar retweeted

Rahul Satija@satijalab·Apr 11

We are excited to release Seurat v5, with new methods for multimodal, spatially resolved, and massively scalable single-cell analysis. satijalab.org/seurat

Rahul Dhodapkar@rahuldhodapkar·Mar 27

Perhaps this is a good way to avoid the bias of fixating on the genes we already "know" and the processes we are already familiar with!

Rahul Dhodapkar@rahuldhodapkar·Mar 27

When asked to generate citations/supporting evidence for the purported functions, ChatGPT confidently generates some bogus references, but I think it's still a great way to broaden thinking, and identify new processes to follow up on using good old-fashioned PubMed search

Rahul Dhodapkar@rahuldhodapkar·Mar 27

I've been playing around with using #ChatGPT to help think about and process differential expression gene lists and found that very simple prompts are able to do reasonably well in generating high-level overviews of known gene functions, just pasting from #Seurat `FindMarkers`
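
As a rough illustration of what such a simple prompt might look like, the sketch below turns a pasted gene list into a single prompt string. The function `build_marker_prompt`, its parameters, and the prompt wording are all hypothetical; the post does not show the prompts that were actually used, and no chat API is called here.

```python
def build_marker_prompt(genes, cluster_label="this cluster", max_genes=50):
    """Format a marker-gene list as a plain-text prompt (hypothetical wording)."""
    # Drop empty entries and stray whitespace left over from copy-pasting.
    cleaned = [g.strip() for g in genes if g.strip()][:max_genes]
    return (
        f"The following genes are differentially expressed in {cluster_label}: "
        + ", ".join(cleaned)
        + ". Give a high-level overview of their known functions."
    )


# Gene symbols as they might be copied from the row names of a marker table.
prompt = build_marker_prompt(["CD3E", "IL7R", " CCR7 "], cluster_label="cluster 3")
print(prompt)
```

The resulting string can be pasted into any chat interface.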
