Dimitriadis Nikos

47 posts

@nikdimitriadis

student researcher at Google DeepMind working on post-training. PhD @ EPFL.

Joined December 2020
229 Following · 207 Followers
Pinned Tweet
Dimitriadis Nikos@nikdimitriadis·
Fine-tuning pre-trained models leads to catastrophic forgetting: gains on one task cause losses on others. These issues worsen in multi-task merging scenarios. Enter LiNeS 📈, a method to solve them with ease. 🔥 🌐: lines-merging.github.io 📜: arxiv.org/abs/2410.17146 🧵 1/11
Dimitriadis Nikos retweeted
Abdellah Rahmani@arahmani_AR·
🎉 Thrilled to share: our paper FANTOM with Prof. @pafrossard, Flow-based approach for Dynamic Temporal Causal models with non-Gaussian or Heteroscedastic Noises, has been accepted at NeurIPS 2025! (1/6)
Dimitriadis Nikos retweeted
Arthur Douillard@Ar_Douillard·
Two DiLoCo-related papers accepted at NeurIPS. My distributed learning team at GDM is thriving. More bullish than ever on distributed learning.
Dimitriadis Nikos retweeted
Maksym Andriushchenko@maksym_andr·
🚨 Incredibly excited to share that I'm starting my research group focusing on AI safety and alignment at the ELLIS Institute Tübingen and Max Planck Institute for Intelligent Systems in September 2025! 🚨

Hiring. I'm looking for multiple PhD students: both those able to start in Fall 2025 (i.e., as soon as possible) and through centralized programs like CLS, IMPRS, and ELLIS (the deadlines are in November) to start in Spring–Fall 2026. I'm also searching for postdocs, master's thesis students, and research interns. Fill the Google form below if you're interested!

Research group. We will focus on developing algorithmic solutions to reduce harms from advanced general-purpose AI models. We're particularly interested in alignment of autonomous LLM agents, which are becoming increasingly capable and pose a variety of emerging risks. We're also interested in rigorous AI evaluations and informing the public about the risks and capabilities of frontier AI models. Additionally, we aim to advance our understanding of how AI models generalize, which is crucial for ensuring their steerability and reducing associated risks. For more information about research topics relevant to our group, please check the following documents:
- International AI Safety Report,
- An Approach to Technical AGI Safety and Security by DeepMind,
- Open Philanthropy's 2025 RFP for Technical AI Safety Research.

Research style. We are not necessarily interested in getting X papers accepted at NeurIPS/ICML/ICLR. We are interested in making an impact: this can be papers (and NeurIPS/ICML/ICLR are great venues), but also open-source repositories, benchmarks, blog posts, even social media posts—literally anything that can be genuinely useful for other researchers and the general public.

Broader vision. Current machine learning methods are fundamentally different from what they used to be pre-2022. The Bitter Lesson summarized and predicted this shift very well back in 2019: "general methods that leverage computation are ultimately the most effective". Taking this into account, we are only interested in studying methods that are general and scale with intelligence and compute. Everything that helps to advance their safety and alignment with societal values is relevant to us. We believe getting this—some may call it "AGI"—right is one of the most important challenges of our time. Join us on this journey!
Dimitriadis Nikos retweeted
Skander Moalla@SkanderMoalla·
🚀 Big time! We can finally do LLM RL fine-tuning with rewards and leverage offline/off-policy data! ❌ You want rewards, but GRPO only works online? ❌ You want offline, but DPO is limited to preferences? ✅ QRPO can do both! 🧵Here's how we do it:
Dimitriadis Nikos retweeted
Vincent Jung@jungvinc·
🧬 New roadmap out in Nature Reviews Molecular Cell Biology! 🤖 We show how RNA-LMs + GNNs can come together to model the RNA interactome & uncover new roles for non-coding RNA. 💊 Clinical links to RNA therapies for cancer & neuro diseases. 📄 Read it: bit.ly/4kNblk6
Nature Reviews Molecular Cell Biology@NatRevMCB

New Online! Decoding the interactions and functions of non-coding RNA with artificial intelligence bit.ly/4kNblk6

Dimitriadis Nikos retweeted
Ke Wang@wangkeml·
Very happy to share our new work: MEMOIR 📖! MEMOIR is a lifelong model editing method to edit factual knowledge in LLMs with minimal forgetting even with thousands of edits!
Dimitriadis Nikos retweeted
Yiming Qin@qinym710·
How can we inject new knowledge into LLMs without full retraining, forgetting, or breaking past edits? We introduce MEMOIR 📖— a scalable framework for lifelong model editing that reliably rewrites thousands of facts sequentially using a residual memory module. 🔥 🧵1/7
Dimitriadis Nikos retweeted
Manuel Madeira@manuelmlmadeira·
Very happy to see DeFoG accepted as a spotlight at ICML! We’ve also open-sourced the code. Looking forward to your feedback on our repo! 🌬️😶‍🌫️
Yiming Qin@qinym710

Happy to share that DeFoG: Discrete Flow Matching for Graph Generation will be showcased as a Spotlight Poster at #ICML2025 ! -> Explore the paper: arxiv.org/abs/2410.04263 -> Open-source code: github.com/manuelmlmadeir… Looking forward to your feedback on our repository!

Dimitriadis Nikos@nikdimitriadis·
Going to Singapore for ICLR 🇸🇬 Happy to meet people and discuss anything related to LLMs, model merging, multi-task and continual learning! DM me if you want to chat! We will also be presenting two papers: 1. LiNeS x.com/nikdimitriadis… 2. PaLoRA arxiv.org/abs/2407.08056
Dimitriadis Nikos retweeted
Vishaal Udandarao@vishaal_urao·
🚀New Paper arxiv.org/abs/2412.06712 Model merging is the rage these days: simply fine-tune multiple task-specific models and merge them at the end. Guaranteed perf boost! But wait, what if you get new tasks over time, sequentially? How to merge your models over time? 🧵👇
Dimitriadis Nikos retweeted
Manuel Madeira@manuelmlmadeira·
Good morning NeurIPS! Excited to present "Generative Modelling of Structurally Constrained Graphs" with Clément Vignac, @DorinaThanou , and @pafrossard. Come find me at poster #2704 in East Exhibit Hall A-C from 16:30-19:30 if you're interested in constrained graph diffusion!
Dimitriadis Nikos@nikdimitriadis·
@yhai_models It can be used for OOD generalization, for merging multiple models from different tasks or different checkpoints of the same task, and even for merging policies fine-tuned with different rewards via RLHF. We have experiments for all these cases, some in vision and some in NLP!
François Fleuret@francoisfleuret·
Neil@Neiluss_

@francoisfleuret I am just starting to work on Task Arithmetic for multi-task meta learning and this paper is a huge step forward! I'll certainly use LiNeS. Btw, why is all your research so cool? Really, cool is the word.

Dimitriadis Nikos@nikdimitriadis·
@harambe_musk Thank you! Of course there is still a lot of room for improvement, but we think LiNeS is a step in this direction!
harambe_musk🍌@harambe_musk·
So we pretty much solved the issue where models would forget general knowledge when fine-tuned for a specific task and even improve performance in 'out of distribution' tasks. Well done! LiNeS was much needed!
Dimitriadis Nikos@nikdimitriadis·
RLHF policy merging: LiNeS helps combine policies fine-tuned with different rewards in RLHF, improving generalization across reward functions and creating Pareto-dominating solutions. 🏆🤖 🧵 10/11
Dimitriadis Nikos@nikdimitriadis·
How does it work? LiNeS scales parameter updates with a linear schedule across layers: shallow layers stay close to pre-trained values for generalization 🌎, while deeper ones specialize in the task. 🎯 This scaling prevents forgetting while maintaining task information. 🧵 3/11
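The layer-wise scaling described in that tweet can be sketched in a few lines. This is a minimal illustration, not the paper's actual implementation: `lines_rescale`, `alpha`, and `beta` are hypothetical names, and weights are represented as a simple list of per-layer arrays ordered from shallow to deep.

```python
import numpy as np

def lines_rescale(pretrained, finetuned, alpha=0.0, beta=1.0):
    """Sketch of a LiNeS-style merge: scale each layer's update
    (finetuned - pretrained) by a coefficient that grows linearly
    with depth, so shallow layers stay near the pre-trained weights
    while deep layers keep most of their task-specific update."""
    num_layers = len(pretrained)
    merged = []
    for layer_idx, (w_pre, w_ft) in enumerate(zip(pretrained, finetuned)):
        # Depth fraction in [0, 1]: 0 for the shallowest layer, 1 for the deepest.
        frac = layer_idx / (num_layers - 1) if num_layers > 1 else 1.0
        coeff = alpha + beta * frac
        merged.append(w_pre + coeff * (w_ft - w_pre))
    return merged
```

With `alpha=0.0` and `beta=1.0`, the shallowest layer is restored exactly to its pre-trained weights and the deepest layer keeps its full fine-tuned update, with intermediate layers interpolated linearly in between.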