Quentin Fournier

33 posts

@qfournier2

Research Fellow at @Mila_Quebec working on language models for #drugdiscovery 🧬

Montreal · Joined April 2023

34 Following · 55 Followers

Quentin Fournier reposted

Darshan Patil @dapatil211 · Mar 5

🧬 New paper: Scientific datasets evolve as science evolves. With proteins, new sequences get added, annotations get corrected, and noisy entries get curated out. Introducing CoPeP, a continual-pretraining benchmark for protein LMs. Details 🧵 1/n

Quentin Fournier reposted

Mila - Institut québécois d'IA @Mila_Quebec · Dec 22

Congrats to Prashant (@prashantg_17), Davide (@DavideBald42296 ), Quentin (@qfournier2), and Sarath (@apsarathchandar) on CADmium, a new method that rethinks text-to-CAD to generate high-fidelity 3D models! Read their blog post: mila.quebec/en/article/imp…

Quentin Fournier reposted

Sarath Chandar @apsarathchandar · Oct 24

I am recruiting several graduate students (both MSc and PhD level) for Fall 2026 @ChandarLab! The application deadline is December 01. Please apply through the @Mila_Quebec supervision request process here: mila.quebec/en/prospective…. More details about the recruitment process here: chandar-lab.github.io/join/

Quentin Fournier reposted

Sarath Chandar @apsarathchandar · Oct 3

At @ChandarLab, we are happy to announce the third edition of our assistance program to provide feedback for members of communities underrepresented in AI who want to apply to high-profile graduate programs. Want feedback? Details: chandar-lab.github.io/grad_app/. Deadline: Nov 01! cc: @Mila_Quebec, @polymtl, @CIFAR_News

Quentin Fournier reposted

Sarath Chandar @apsarathchandar · Aug 22

Molecules speak in atoms and bonds. LLMs can learn that language. Even with SOTA #denovo design, our largest molecular LLM study finds a plot twist: early saturation, weak scaling, and proxy metrics that mislead on real tasks! Led by @kchitsaz and @roshan_msb 🧵 More in thread:

Quentin Fournier reposted

Jack Morris @jxmnop · Jun 24

In the beginning, there was BERT. Eventually BERT gave rise to RoBERTa. Then, DeBERTa. Later, ModernBERT. And now, NeoBERT. The new state-of-the-art small-sized encoder:

Quentin Fournier reposted

Biology+AI Daily @BiologyAIDaily · May 24

Structure-Aligned Protein Language Model
1. Structure-Aligned Protein Language Model (SaPLM) augments sequence-only protein language models (pLMs) with structural knowledge by aligning their representations with those from pre-trained protein graph neural networks (pGNNs), achieving large gains on structure-aware tasks without compromising sequence generality.
2. The key innovation is a dual-task framework: (1) a latent-level contrastive learning task that aligns residue embeddings from the pLM and pGNN across different proteins, capturing inter-protein structural patterns, and (2) a physical-level task that predicts structural tokens from pLM outputs, encoding intra-protein structural geometry.
3. To avoid noisy or overly simple residues during training, a residue loss selection module is introduced. It selects residue losses with high excess learning potential by comparing the current model’s losses to a high-quality reference model trained on curated structures.
4. Applying this structure alignment method to ESM2 and AMPLIFY yields SaESM2 and SaAMPLIFY, which significantly outperform their unaligned counterparts on multiple benchmarks. SaESM2 improves contact prediction P@L/5 by 13% and stability prediction Spearman correlation by 4.5%.
5. Unlike prior models that use structural input during inference (e.g., structure tokens), SaPLM requires only sequences at inference time, maintaining the generality and scalability of pLMs while enhancing structural reasoning via pretraining.
6. On mutation effect prediction tasks, SaESM2 achieves the highest performance on binding fitness (GB1) and stability prediction, outperforming even ESM2-s and ISM models explicitly trained on these tasks.
7. SaESM2 also achieves state-of-the-art results on 6 out of 9 downstream property prediction tasks (e.g., metal binding, DeepLoc, EC number classification), showing the effectiveness of structural alignment for biologically relevant function prediction.
8. Ablation studies confirm that both the latent- and physical-level tasks are critical for performance, with the latent-level alignment contributing the most. Replacing the GearNet embeddings with AlphaFold Evoformer embeddings significantly degrades performance.
9. Residue embedding visualization using UMAP shows that aligned models (SaESM2, SaAMPLIFY) learn more structured latent spaces, with better separation of secondary structure types and physicochemical similarity among amino acids.
10. Structure alignment emerges as a generalizable and efficient way to enrich pLMs with structural context using only sequence input. SaESM2 and SaAMPLIFY set a new benchmark for structure-aware yet sequence-only protein modeling.
💻Code: github.com/chandar-lab/AM…
📜Paper: arxiv.org/abs/2505.16896
#ProteinLLM #StructureAlignment #pLM #ProteinStructure #ContrastiveLearning #GNN #ComputationalBiology #SaESM2 #AMPLIFY #AI4Science
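
The residue loss selection idea in the summary above (keep the residues whose loss under the current model most exceeds the loss under a reference model) can be sketched roughly as follows. This is only an illustrative sketch, not the paper's implementation: the function names, the `keep_fraction` parameter, and the top-k selection rule are all assumptions made for the example.

```python
from typing import List


def select_residue_losses(current: List[float],
                          reference: List[float],
                          keep_fraction: float = 0.5) -> List[int]:
    """Return sorted indices of residues with the largest excess loss.

    Excess loss is (current model loss - reference model loss) per residue.
    keep_fraction is a hypothetical knob, not a value from the paper.
    """
    if not current or len(current) != len(reference):
        raise ValueError("loss lists must be non-empty and of equal length")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    excess = [c - r for c, r in zip(current, reference)]
    # Keep at least one residue; truncate the fractional count toward zero.
    k = max(1, int(keep_fraction * len(excess)))
    ranked = sorted(range(len(excess)), key=lambda i: excess[i], reverse=True)
    return sorted(ranked[:k])


def selected_mean_loss(current: List[float], selected: List[int]) -> float:
    """Average the current model's loss over the selected residues only."""
    return sum(current[i] for i in selected) / len(selected)


if __name__ == "__main__":
    cur = [2.0, 0.1, 1.5, 3.0]   # per-residue losses, current model (made up)
    ref = [1.9, 0.1, 0.5, 1.0]   # per-residue losses, reference model (made up)
    idx = select_residue_losses(cur, ref, keep_fraction=0.5)
    print(idx, selected_mean_loss(cur, idx))
```

In this toy setup, residues that both models find easy (or that both find hard, which may indicate noise) have small excess loss and are dropped from the training signal.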

Quentin Fournier reposted

Carl Doersch @CarlDoersch · Apr 9

We're very excited to introduce TAPNext: a model that sets a new state of the art for Tracking Any Point in videos, by formulating the task as Next Token Prediction. For more, see: tap-next.github.io 🧵

Quentin Fournier reposted

Sarath Chandar @apsarathchandar · Apr 4

Can better architectures & representations make self-play enough for zero-shot coordination? 🤔 We explore this in our ICLR 2025 paper: A Generalist Hanabi Agent. We develop R3D2, the first agent to master all Hanabi settings and generalize to novel partners! 🚀 #ICLR2025 1/n

Quentin Fournier reposted

Sarath Chandar @apsarathchandar · Mar 21

In my lab, we have not one but four open postdoc positions! These positions cover developing foundation models for text, proteins, small molecules, genomic data, time series data, and astrophysics data! If you have strong research expertise and a PhD in LLMs and Foundation Models, and you are willing to learn about domain-specific problems and collaborate with domain experts, this is an ideal position for you! Actual links in the next tweet! 1/2

Quentin Fournier reposted

Mila - Institut québécois d'IA @Mila_Quebec · Mar 3

Congratulations to Lola (@lo_LB_La) and Sarath (@apsarathchandar)! Read their blog post: mila.quebec/en/article/neo…

Sarath Chandar @apsarathchandar

2025 BERT is NeoBERT! We have fully pre-trained a next-generation encoder for 2.1T tokens with the latest advances in data, training, and architecture. This is a heroic effort from my PhD student @lo_LB_La in collaboration with @qfournier2 and Mariam El Mezouar (1/n)

Quentin Fournier reposted

1LittleCoder💻 @1littlecoder · Mar 1

A new BERT baby! If you are still using the huge RoBERTa or DeBERTa for your NLP tasks, here's NeoBERT!

Quentin Fournier reposted

Sarath Chandar @apsarathchandar · Mar 5

I am excited to share that our BindGPT paper won the best poster award at @RealAAAI #AAAI2025! Congratulations to the team! Work led by @artemZholus!

Sarath Chandar @apsarathchandar

What's the foundational model for generative chemistry? Our work, BindGPT, is a good candidate, and it will be presented at #AAAI2025 today! We built a simple transformer language model that beats diffusion models by just generating 3D molecules as text! Led by @artemZholus 1/n

Quentin Fournier reposted

Sarath Chandar @apsarathchandar · Feb 28

Quentin Fournier reposted

CoLLAs 2026 @CoLLAs_Conf · Nov 15

📢 Exciting News! The Fourth Conference on Lifelong Learning Agents (CoLLAs 2025) will be held at the University of Pennsylvania (@Penn) in Philadelphia, USA 🇺🇸
🗓️ Important Dates:
Abstract Deadline: Feb 21, 2025
Submission Deadline: Feb 26, 2025
Conference Dates: Aug 11 - Aug 14, 2025
We invite submissions that present new theories, methodologies, applications, or insights into algorithms and benchmarks designed for non-i.i.d. and non-stationary settings. Accepted papers will be published in the Proceedings of Machine Learning Research (PMLR).
📚 Full CFP: lifelong-ml.cc/Conferences/20…
#CoLLAs2025 #AI #MachineLearning #ContinualLearning #LifelongLearning #ResearchConference #CallForPapers #NonStationaryLearning

Quentin Fournier reposted

Sarath Chandar @apsarathchandar · Nov 11

After many years, I will be attending @emnlpmeeting EMNLP! @ChandarLab will present three papers at EMNLP 2024 (details in the thread). I am also recruiting PhDs/postdocs, so please email me if you are attending EMNLP and are interested in chatting about these positions! 1/n

Quentin Fournier @qfournier2 · Oct 2

🚨 Exciting news! Our state-of-the-art protein language models are now available on Hugging Face 🤗 Discover AMPLIFY at huggingface.co/chandar-lab and start experimenting today!

Quentin Fournier reposted

Sarath Chandar @apsarathchandar · Sep 27

Are you finishing your PhD in LLMs and are looking for a postdoctoral position? Come join Prof. Amal Zouaq and me at @Mila_Quebec! We have multiple openings for postdoctoral candidates in NLP/LLM. Details: shorturl.at/gwpgG Deadline: 30th October. Please retweet for maximum reach!

Quentin Fournier reposted

Leo Zang @LeoTZ03 · Sep 27

Recently added into Database
1. Protein-Mamba: Biological Mamba Models for Protein Function Prediction arxiv.org/abs/2409.14617
2. Protein Language Models: Is Scaling Necessary? biorxiv.org/content/10.110…
3. PepINVENT: Generative peptide design beyond the natural amino acids arxiv.org/abs/2409.14040
4. Navigating Chemical Space with Latent Flows arxiv.org/abs/2405.03987
5. DiffPaSS -- High-performance differentiable pairing of protein sequences using soft scores arxiv.org/abs/2409.16142
6. Structure-based Drug Design with Equivariant Diffusion Models arxiv.org/abs/2210.13695
7. dnaGrinder: a lightweight and high-capacity genomic foundation model arxiv.org/abs/2409.15697
8. Evaluating the representational power of pre-trained DNA language models for regulatory genomics biorxiv.org/content/10.110…

Quentin Fournier reposted

owl @owl_posting · Sep 26

biorxiv.org/content/10.110… As previously mentioned, AlphaFold2’s confidence has been used to predict protein disorder (Ruff and Pappu, 2021), but we show that AlphaFold2 cannot differentiate between disordered proteins and non-protein sequences, whereas AMPLIFY can. Figure 2H demonstrates that AMPLIFY embedding similarity can separate human disordered proteins (> 25% intrinsically disordered in the DisProt database) (Aspromonte et al., 2024) from hypothetical proteins (PE=5 and no annotation for localization) at a ROC-AUC of 0.94, compared to a score of 0.44 when using AlphaFold2’s pLDDT confidence metric. This demonstrates that training solely on available structures is insufficient for a comprehensive understanding of protein behavior. neat
