Bôdi
@agugumon
Aw aaawa aawaw aaaaaaa


The "Café com Deus Pai" (Coffee with God the Father) of my day

Excited to share CpGPT, a new foundation model for DNA methylation, crafted to understand the 'language' of epigenetics! 🧬🤖

What’s CpGPT?
CpGPT is a transformer model pre-trained on CpGCorpus, a huge dataset of public DNA methylation info. By integrating sequence, positional, and epigenetic data, it captures complex epigenetic interactions from even a small part of the methylome.

📄 Preprint: biorxiv.org/cgi/content/sh…

Why CpGPT?
DNA methylation is key to gene expression and chromatin structure, and methylation at CpG sites often changes with aging and in diseases like cancer. Here’s what sets CpGPT apart:

✨ CpGPT Highlights:
🔧 Deep Pretraining: Trained on over 100K samples from 1,500+ studies, covering 1M+ CpG sites.
🗺️ Zero-Shot Skills: CpGPT imputes methylation from minimal data, mapping different Illumina arrays to a common reference.
🐭 Cross-Mammalian Reach: When fine-tuned, CpGPT accurately imputes methylation in new mammalian species.
📈 Strong Performance: Fine-tuned CpGPT ranks 2nd overall and 1st on the public leaderboard for age and mortality prediction in the Biomarkers of Aging Challenge.

Special shoutout to Lucas (@ollimacsacul) for his outstanding leadership on this project! Big thanks as well to all the co-authors (@rv_sehgal, @prof_horvath, etc.) for their dedication and swift feedback throughout.

We're committed to open science and are actively preparing the code. Expect the full code and model weights to be available soon, likely within the next few weeks!