Stefano Melacci

23 posts

@StefanoMelacci

Out of This World

Joined November 2011
39 Following, 44 Followers
Vladimir Araujo (@vgaraujov)
The Solution: Enter L2R 🧠
- Adapter Isolation: It trains adapters (small fine-tuning modules) separately for each task to prevent interference.
- Smart Routing: L2R learns how to dynamically combine these adapters using a memory of previous tasks before inference.
[3/n]
Vladimir Araujo (@vgaraujov)
📢 Camera-ready of my recent paper accepted to #EMNLP2024 @emnlpmeeting: "Learning to Route for Dynamic Adapter Composition in Continual Learning with Language Models" arxiv.org/abs/2408.09053 🧵Introducing L2R (Learning to Route) in #ContinualLearning for #NLProc! [1/n]
Vladimir Araujo (@vgaraujov)

📢 Glad to share two accepted papers at @emnlpmeeting #EMNLP2024 🎉 "Learning to Route for Dynamic Adapter Composition in Lifelong Language Learning" with @tuytelaarslab "Probing the Linguistic and Visual Knowledge of Pixel-based Language Models" with @KushalTatariya @mdlhx

Stefano Melacci retweeted
Matteo Tiezzi (@TiezziMatteo)
#CoLLAs2024 I will present our novel architectural solution for Continual Learning, Memory Head, at the upcoming @CoLLAs_Conf in Pisa! Paper and video: lifelong-ml.cc/Conferences/20… 📝Stop by our poster on Tuesday 30th and the oral presentation on the morning of Wednesday 31st!
Stefano Melacci retweeted
Matteo Tiezzi (@TiezziMatteo)
We summarized the key ingredients underlying the current generation of models:
*️⃣ linear recurrence
*️⃣ element-wise recurrence
*️⃣ gating mechanisms
and where there might be more room for novel research activities: #onlinelearning arxiv.org/abs/2406.09062
Stefano Melacci retweeted
Matteo Tiezzi (@TiezziMatteo)
Goal: efficiently 𝐩𝐫𝐨𝐜𝐞𝐬𝐬 𝐥𝐨𝐧𝐠 𝐬𝐞𝐪𝐮𝐞𝐧𝐜𝐞𝐬 + 𝐩𝐫𝐞𝐬𝐞𝐫𝐯𝐞 𝐥𝐨𝐧𝐠 𝐭𝐞𝐫𝐦 𝐝𝐞𝐩𝐞𝐧𝐝𝐞𝐧𝐜𝐢𝐞𝐬 TL;DR #Transformers + #RNNs, #DeepStatespace models ➡️ progressive confluence of previously disjoint architectures arxiv.org/abs/2406.09062
Stefano Melacci retweeted
Matteo Tiezzi (@TiezziMatteo)
🧐 Interested in the details of the latest SOTA architectures (#RWKV, #RetNet, #Mamba, #Griffin) for long sequences? 📢 "State-space Modeling in Long Sequence Processing: A Survey on Recurrence in the Transformer Era" arxiv.org/abs/2406.09062
Stefano Melacci retweeted
CoLLAs 2026 (@CoLLAs_Conf)
🚨1 month left until the #CoLLAs2024 paper submission deadline!🚨 Don’t miss your opportunity to contribute to the field of lifelong learning. Finalize your research and submit by 15 Feb 2024! 📄 Prepare your submission: lifelong-ml.cc/Conferences/20… 🗓️ Abstract deadline: 09 Feb 2024