Maksim Kuznetsov

29 posts

@Max__Kuznetsov

Research Scientist at @InSilicoMeds

Montréal, Québec · Joined January 2018
145 Following · 125 Followers
Maksim Kuznetsov retweeted
Sarath Chandar @apsarathchandar
I am excited to share that our BindGPT paper won the best poster award at @RealAAAI #AAAI2025! Congratulations to the team! Work led by @artemZholus!
Sarath Chandar @apsarathchandar

What's the foundation model for generative chemistry? Our work, BindGPT, is a good candidate, and it will be presented at #AAAI2025 today! We built a simple transformer language model that beats diffusion models by just generating 3D molecules as text! Led by @artemZholus 1/n

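The "3D molecules as text" idea in the BindGPT tweet above can be illustrated with a minimal round-trip sketch: flatten each atom's element symbol and rounded coordinates into a whitespace-separated token stream that a language model could emit one token at a time, then parse it back. The tokenization below is an illustrative assumption, not BindGPT's actual scheme; `molecule_to_text` and `text_to_molecule` are hypothetical names.

```python
def molecule_to_text(atoms, coords, decimals=2):
    """Flatten (element, x, y, z) rows into one whitespace-separated token stream."""
    tokens = []
    for elem, (x, y, z) in zip(atoms, coords):
        tokens.append(elem)
        tokens.extend(f"{v:.{decimals}f}" for v in (x, y, z))
    return " ".join(tokens)


def text_to_molecule(text):
    """Invert molecule_to_text: parse one element symbol plus 3 coordinates per atom."""
    fields = text.split()
    atoms, coords = [], []
    for i in range(0, len(fields), 4):
        atoms.append(fields[i])
        coords.append(tuple(float(v) for v in fields[i + 1:i + 4]))
    return atoms, coords


# Round-trip a water molecule through the text representation.
water = molecule_to_text(
    ["O", "H", "H"],
    [(0.0, 0.0, 0.0), (0.76, 0.59, 0.0), (-0.76, 0.59, 0.0)],
)
```

The payoff of such a serialization is that generation and reconstruction reduce to ordinary language-model decoding plus a cheap parser.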
Maksim Kuznetsov @Max__Kuznetsov
7/ Finally, nach0-pc enables de novo ligand generation, designing molecules that bind to protein pockets.
Maksim Kuznetsov @Max__Kuznetsov
6/ By injecting noise into point clouds, nach0-pc can generate alternative molecular structures that retain the reference molecule's shape.
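The noise-injection step described in tweet 6/ can be sketched in a few lines: perturb each atom position with isotropic Gaussian noise, so sampled structures stay close to the reference shape while varying in detail. This is a generic sketch of the idea, not nach0-pc's actual procedure; `perturb_point_cloud` and its `sigma` parameter are illustrative.

```python
import random


def perturb_point_cloud(coords, sigma=0.1, seed=None):
    """Add zero-mean Gaussian noise (std `sigma`) to every coordinate.

    Small sigma preserves the overall shape of the reference molecule;
    larger sigma allows more structural variation.
    """
    rng = random.Random(seed)
    return [tuple(c + rng.gauss(0.0, sigma) for c in point) for point in coords]


# A toy reference structure and one noisy variant of it.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
variant = perturb_point_cloud(reference, sigma=0.05, seed=7)
```

In a generation pipeline, the perturbed point cloud would then be re-encoded and decoded into a full molecule, trading off shape fidelity against diversity via `sigma`.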
Maksim Kuznetsov @Max__Kuznetsov
1/ At @InSilicoMeds, we’re exploring how language models can process and generate 3D molecular structures. nach0-pc fuses a specialized text-based representation with a domain-specific encoder, enabling precise generation and conditioning on 3D molecular structures.
Biology+AI Daily @BiologyAIDaily
Bio2Token: All-atom tokenization of any biomolecular structure with Mamba @FlagshipPioneer

• This paper introduces "Bio2Token", a method that tokenizes biomolecular structures at an all-atom level using Mamba. Unlike many current approaches that rely on coarse-grained residue-level representations, Bio2Token focuses on a more detailed atomic-level tokenization.
• The innovation lies in the use of quantized auto-encoders that learn atom-level representations, achieving reconstruction accuracies at or below roughly 1 Ångström.
• Mamba, a state space model, plays a key role by providing efficient and scalable encoding, overcoming computational limitations of traditional transformer-based models. Bio2Token can handle structures of up to 95,000 atoms, significantly larger than the limit for many transformer models.
• This approach not only achieves high accuracy but also uses fewer parameters and training resources than existing methods like AlphaFold-3 and ESM-3.
• Bio2Token demonstrates versatility by tokenizing proteins, RNA, and small molecules, making it a flexible tool for biomolecular structure representation.
• The quantized auto-encoders (QAE) efficiently transform 3D structures into 1D discrete tokens, allowing future integration with language models for biomolecular tasks.
• The authors present domain-specific tokenizers (mol2token, protein2token, RNA2token) and a combined tokenizer (bio2token) that generalizes across different types of biomolecules.
• Compared to ESM-3, Bio2Token achieves a lower reconstruction RMSE and superior performance across protein and RNA datasets, demonstrating its potential as a robust tool for accurate structural modeling.
• The combination of the Mamba-based architecture and quantized auto-encoders provides a lightweight yet powerful solution, avoiding the quadratic computational cost seen in transformers.
• Limitations include ensuring chemical validity in reconstructed structures, as even small deviations can lead to unrealistic bonding. Future directions involve adding more training data and integrating post-processing steps for chemical validity. @oliviaviessmann

📜 Paper: arxiv.org/abs/2410.19110
#biomoleculardesign #proteinmodeling #machinelearning #stateSpaceModel #bioinformatics #Mamba #tokenization
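The quantized auto-encoder step described above (turning continuous structure embeddings into discrete 1D tokens) boils down to a nearest-neighbour lookup in a learned codebook: each continuous vector is replaced by the index of its closest codebook entry. A minimal sketch, with illustrative 2-D codebook values rather than anything from the paper:

```python
def quantize(vectors, codebook):
    """Map each continuous vector to the index of its nearest codebook entry
    (squared Euclidean distance). The resulting index sequence is the 1D
    discrete token stream a downstream language model would consume."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sqdist(v, codebook[k]))
        for v in vectors
    ]


# Toy 3-entry codebook; in a real QAE it is learned jointly with the encoder.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
tokens = quantize([(0.1, -0.1), (0.9, 0.2)], codebook)  # -> [0, 1]
```

Decoding runs the lookup in reverse (token index back to codebook vector) before the decoder reconstructs atom positions, which is where the sub-Ångström reconstruction accuracy is measured.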
Maksim Kuznetsov @Max__Kuznetsov
Happy to present our latest results at @InSilicoMeds on molecular graph generation at #AAAI2021! Check out our joint work with @d_polykovskiy, "MolGrow: A Graph Normalizing Flow for Hierarchical Molecular Generation", at the poster session on 5 Feb, 08:45-10:30 AM & 04:45-06:30 PM PST
Maksim Kuznetsov @Max__Kuznetsov
@norpadon For complex pipelines there's also dvc.org, though how convenient it is is debatable
Artur Chakhvadze @norpadon
@Max__Kuznetsov My pipelines are too complex; Lightning isn't enough for them
Artur Chakhvadze @norpadon
Every time I start a new ML project, I lose several days because I don't know how to write the training loop. How to build the model from a config, how to test it, how to organize the pipeline so it doesn't have to be rewritten if I later want to stick a GAN in the middle.
Maksim Kuznetsov @Max__Kuznetsov
@norpadon To get rid of training-loop problems, our ancestors used to take a simple Soviet...