Fartash Faghri

220 posts

@FartashFg

ML Research @Apple. @UofT PhD.

Toronto, Canada · Joined November 2013
71 Following · 1.1K Followers
Fartash Faghri retweeted
Mehrdad Farajtabar @MFarajtabar:
Continual Learning remains one of the most challenging “holy grails” of AI. Most discussions focus on catastrophic forgetting: models lose what they previously learned. But there is another, equally important failure mode: over long continual training, neural networks can also lose their plasticity, i.e., their ability to learn new things weakens over time.

In our ICLR 2026 work with colleagues at @Apple and @ETH, we study this phenomenon, known as Loss of Plasticity (LoP), from a geometric perspective. We show that LoP can arise when gradient dynamics become trapped in invariant manifolds of parameter space. In particular, we analyze two types of traps:

🔴 Frozen units: units saturate, gradients vanish, and they become effectively silent to backpropagation.
🔵 Cloned units: units become redundant, receive matching forward and backward signals, and move together.

For these structures, the gradient is tangent to the trap. Once standard GD/SGD enters these affine subspaces, it cannot leave them on its own. This means the dynamics can remain stuck even when the data distribution or task changes.

What we find especially interesting is that these traps are not merely optimization bugs. The same feature-learning pressures that help networks learn useful representations for the current task can also push them toward states with less future adaptability. This raises a difficult open question for future work: are neural networks trained with SGD and cross-entropy loss fundamentally the right framework for continual learning?

Please read the full paper for more details: arxiv.org/pdf/2510.00304
Quoting Amir Joudaki @AmirJoudaki:
Neural nets don’t just forget. Sometimes, after long training, they lose the ability to learn at all. In our #ICLR2026 poster, we model Loss of Plasticity as gradient dynamics trapped in invariant manifolds: 🔴 frozen units, 🔵 cloned units. The video makes the traps visible.

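Both traps are easy to reproduce in a toy setting. Below is a minimal PyTorch sketch (my illustration, not the paper's code) of a one-hidden-layer ReLU network: a saturated "dead" unit receives exactly zero gradient, and two cloned units receive identical gradients, so gradient descent on its own can neither revive the dead unit nor separate the clones.

```python
# Minimal sketch (not the paper's code): the two gradient traps on a toy
# one-hidden-layer ReLU regression network.
import torch

torch.manual_seed(0)
x = torch.randn(64, 8)                       # toy input batch
y = torch.randn(64, 1)                       # toy regression targets

W1 = torch.randn(16, 8, requires_grad=True)  # input -> hidden
b1 = torch.zeros(16, requires_grad=True)
W2 = torch.randn(1, 16, requires_grad=True)  # hidden -> output

with torch.no_grad():
    # 🔴 Frozen unit: push unit 0 into the dead-ReLU regime so its
    # pre-activation is negative for every input in the batch.
    b1[0] = -1e3
    # 🔵 Cloned units: make units 1 and 2 exact duplicates (matching
    # incoming weights, bias, and outgoing weights).
    W1[2] = W1[1]; b1[2] = b1[1]; W2[0, 2] = W2[0, 1]

h = torch.relu(x @ W1.T + b1)
loss = ((h @ W2.T - y) ** 2).mean()
loss.backward()

print(W1.grad[0].abs().max())                 # 0: the frozen unit gets no gradient
print((W1.grad[1] - W1.grad[2]).abs().max())  # 0: the clones get identical gradients
```

Both prints come out as zero, which is the "gradient tangent to the trap" picture in miniature: the update never has a component that pushes the parameters off the invariant subspace, no matter how the data changes.
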
Fartash Faghri @FartashFg:
In the era of continued pretraining and continued fine-tuning, loss of plasticity means leaving future gains on the table. We need a better theoretical understanding of loss of plasticity. See a great thread unpacking the dynamics. 👇 #ICLR2026 #ContinualLearning #DeepLearning
Quoting Amir Joudaki @AmirJoudaki (the post quoted above).

Fartash Faghri retweeted
Amir Joudaki @AmirJoudaki:
(The post quoted above.)
Fartash Faghri @FartashFg:
Attending #ICLR2026! Feel free to message me to chat about efficient multimodal models, or come find me at:
🗣️ Chairing Oral Session 4C: Vision Language Models III | Fri 24 Apr | 3:15 PM - 4:45 PM
📊 MobileCLIP2 (DFNDR 2B/12M released, links below 👇) | Sat 25 Apr | 10:30 AM - 1:00 PM | Poster Session 5, Pavilion 4, #3713
🪧 Apple Booth | Sat 25 Apr | 1:30 PM - 3:30 PM
📊 Barriers for Learning in an Evolving World: Mathematical Understanding of Loss of Plasticity | Sat 25 Apr | 3:15 PM - 5:45 PM | Poster Session 6, Pavilion 4, #4202
📊 Data-Centric Lessons To Improve Speech-Language Pretraining | Sat 25 Apr | 3:15 PM - 5:45 PM | Poster Session 6, Pavilion 3, #1418
Come work with our MLR team on efficient ML (message me if interested!): jobs.apple.com/en-us/details/…
Apple @ ICLR 2026: machinelearning.apple.com/updates/apple-…
DFNDR-2B: huggingface.co/datasets/apple…
DFNDR-12M: huggingface.co/datasets/apple…
huggingface.co/datasets/apple…
Fartash Faghri retweeted
Oncel Tuzel @OncelTuzel:
LiTo: Surface Light Field Tokenization (ICLR 2026), new work from Apple MLR. LiTo learns a unified 3D representation of geometry and view-dependent appearance, capturing effects like specular highlights and Fresnel reflections and enabling high-fidelity 3D generation from a single image.
Fartash Faghri retweeted
Oncel Tuzel @OncelTuzel:
Come work with us! The Machine Learning Research (MLR) team at Apple is seeking a passionate AI researcher to work on Efficient ML algorithms, including models optimized for fast inference and efficient training methods. Apply here: jobs.apple.com/en-us/details/…
Fartash Faghri retweeted
Vishaal Udandarao @vishaal_urao:
🚀 New paper: arxiv.org/abs/2510.20860
We conduct a systematic data-centric study of speech-language pretraining to improve end-to-end spoken QA! 🎙️🤖 Using our data-centric insights, we pretrain a 3.8B SpeechLM (called SpeLangy) that outperforms 3x larger models! 🧵👇
Fartash Faghri @FartashFg:
@justachetan The email issue is fixed now. The address is the same. Thanks for letting us know.
Aditya Chetan @justachetan:
@FartashFg Hi Fartash, I am interested in applying for this role; however, it seems that the email ID shared has a typo. I got a mail delivery failure. Could you kindly share the correct email ID? Thanks!
Fartash Faghri @FartashFg:
📣 Internship at Apple ML Research
We’re looking for a PhD research intern with interests in efficient multimodal models and video. For our recent work see machinelearning.apple.com/research/fast-…
This is a pure-research internship where the objective is to publish high-quality work. The internship duration is 4-10 months between November 2025 and September 2026.
If you are interested, email your resume to mlr-efficient-ml-internship@group.apple.com and apply at jobs.apple.com/en-us/details/…
Fartash Faghri @FartashFg:
🚨 While booking your travel for #NeurIPS2025, make sure to stay through Sunday, December 7, 8am-5pm, for the CCFM Workshop (Continual and Compatible Foundation Model Updates). We have received exciting paper contributions and have an amazing lineup of speakers.
Quoting Fartash Faghri @FartashFg:
Is your AI keeping up with the world? Announcing the #NeurIPS2025 CCFM Workshop: Continual and Compatible Foundation Model Updates
When/Where: Dec. 6-7, San Diego
Submission deadline: Aug. 22, 2025 (opening soon!)
sites.google.com/view/ccfm-neur… #FoundationModels #ContinualLearning

Fartash Faghri retweeted
Hadi Pouransari @HPouransari:
📣We have PhD research internship positions available at Apple MLR. DM me your brief research background, resume, and availability (earliest start date and latest end date) if interested in the topics below.
Quoting Hadi Pouransari @HPouransari:
Introducing Pretraining with Hierarchical Memories: Separating Knowledge & Reasoning for On-Device LLM Deployment
💡 We propose dividing LLM parameters into 1) an anchor (always used, capturing commonsense) and 2) a memory bank (selected per query, capturing world knowledge). [1/X] 🧵

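The announced split is easy to picture in code. Below is a minimal PyTorch sketch of my reading of the idea; the names and the top-k retrieval rule are my assumptions, not the paper's architecture. A small always-on anchor handles every query, while only a few slots of a large memory bank are fetched per query.

```python
# Illustrative sketch of the announced idea (not the paper's architecture):
# an always-used "anchor" plus a large memory bank selected per query.
import torch

d, n_mem, k = 256, 1024, 4                  # dims and top-k are hypothetical

anchor = torch.nn.Sequential(               # 1) anchor: always used,
    torch.nn.Linear(d, d), torch.nn.GELU(), #    meant to capture commonsense
    torch.nn.Linear(d, d),
)
mem_keys = torch.randn(n_mem, d)            # 2) memory bank: slots holding
mem_vals = torch.randn(n_mem, d)            #    world knowledge

def forward(query_emb: torch.Tensor) -> torch.Tensor:
    """query_emb: (d,) embedding of the incoming query."""
    scores = mem_keys @ query_emb           # relevance of each memory slot
    top = scores.topk(k).indices            # select only k slots per query
    fetched = mem_vals[top].mean(dim=0)     # the rest are never touched
    return anchor(query_emb + fetched)      # anchor consumes query + memories

out = forward(torch.randn(d))
print(out.shape)                            # torch.Size([256])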
Fartash Faghri retweeted
Hadi Pouransari @HPouransari:
(The post quoted above.)
Fartash Faghri retweeted
Yuyang Wang @YuyangW95:
New preprint & open source! 🚨 “SimpleFold: Folding Proteins is Simpler than You Think” (arxiv.org/abs/2509.18480).
We ask: do protein folding models really need expensive, domain-specific modules like pair representations? We build SimpleFold, a scalable 3B folding model built solely on general-purpose transformers + flow matching and trained on 9M structures. SimpleFold supports easy deployment and efficient inference on consumer-level hardware with PyTorch/MLX (try it on your MacBook!) (1/n)
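For readers unfamiliar with the second ingredient: flow matching trains a model to regress the velocity of a path from noise to data, and generation integrates that velocity field. Below is a generic sketch of one training step under the standard straight-line (rectified-flow) recipe; it is an illustration only, not SimpleFold's released code, and the tiny MLP stands in for SimpleFold's transformer over protein structures.

```python
# Generic flow-matching training step -- an illustration of the recipe,
# not SimpleFold's code.
import torch

model = torch.nn.Sequential(             # stand-in for the transformer;
    torch.nn.Linear(3 + 1, 128),         # input: a 3D point plus time t
    torch.nn.SiLU(),
    torch.nn.Linear(128, 3),             # output: predicted velocity
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x1 = torch.randn(256, 3)                 # stand-in for target coordinates
x0 = torch.randn(256, 3)                 # noise sample
t = torch.rand(256, 1)                   # random time in [0, 1]

xt = (1 - t) * x0 + t * x1               # point on the straight noise->data path
target_v = x1 - x0                       # that path's constant velocity
pred_v = model(torch.cat([xt, t], dim=-1))

opt.zero_grad()
loss = ((pred_v - target_v) ** 2).mean() # regress the velocity field
loss.backward()
opt.step()                               # at sampling time, integrate
                                         # dx/dt = v(x, t) starting from noise
```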