Darshan Patil (@dapatil211) - Twitter Profile

Pinned Tweet

🧬 New paper Scientific datasets evolve as science evolves. With proteins, new sequences get added, annotations get corrected, and noisy entries get curated out. Introducing CoPeP, a continual-pretraining benchmark for protein LMs. Details 🧵 1/n

2

29

84

8.5K

Darshan Patil retweeted

Nilaksh@nilaksh404·12 May

Diffusion world models can help test and improve robot policies before running them on real robots. But can the choice of latent space make the WM more faithful? We show that semantic spaces beat reconstruction spaces on task relevant metrics. hskalin.github.io/semantic-wm

5

48

217

41K

Darshan Patil retweeted

Martin Mundt@mundt_martin·30 Apr

Our position paper “Modular Memory is the Key to Continual Learning Agents” (arxiv.org/abs/2603.01761) has been accepted to #ICML2026 @icmlconf as a spotlight! 🎊 Read for a modern perspective on memory, continual learning, and sustainable adaptation at foundation model scale!

Martin Mundt@mundt_martin

Following our @dagstuhl seminar on Continual Learning in the Foundation Model Era, we are now sharing a roadmap! tldr: we view modular memory design as the missing piece to combine the capabilities of In-Weight & In-Context Learning for adaptation at scale arxiv.org/abs/2603.01761

5

27

165

14.6K

Darshan Patil retweeted

tom@tvergarabrowne·30 Apr

accepted at ICML! see you in Seoul 🇰🇷

tom@tvergarabrowne

first paper of the phd 🥳 the Superficial Alignment Hypothesis (SAH) argues that pre-training adds most of the knowledge to a model, and post-training merely surfaces it. however, this hypothesis has lacked a precise definition. we fix this.

2

4

49

2.1K

Darshan Patil retweeted

Ekaterina Lobacheva @ ICLR 2026 🇧🇷@KateLobacheva·26 Apr

Merging models of different depths is not harder than merging same-sized ones. Work led by @nour_shaheen_. Poster at TTU Workshop, Mon 11:15

0

3

5

314

Darshan Patil retweeted

Ekaterina Lobacheva @ ICLR 2026 🇧🇷@KateLobacheva·26 Apr

Data used for model merging affects various parts of the merging procedure differently. Work led by @gauraviyer99. Poster at @scifordl, Sun 16:15

1

2

6

353

Darshan Patil retweeted

Ekaterina Lobacheva @ ICLR 2026 🇧🇷@KateLobacheva·26 Apr

When data distribution shifts, loss should be changed smoothly to preserve useful features. Work led by @dapatil211. Poster at CAO Workshop, Sun 14:45

1

2

8

216

Darshan Patil retweeted

Ekaterina Lobacheva @ ICLR 2026 🇧🇷@KateLobacheva·26 Apr

Per-example gradient alignment shows how the circuits in LLMs are formed. Works led by @mirandrom. Poster at @scifordl, Sun 16:15

1

4

5

325

Darshan Patil retweeted

Ekaterina Lobacheva @ ICLR 2026 🇧🇷@KateLobacheva·26 Apr

LoRA and full fine-tuning use different features even when they give the same quality results. Work led by Jerome Emery. Poster at @scifordl, Sun 11:45

1

2

239

Darshan Patil retweeted

Mehran Shakerinava@MShakerinava·24 Apr

Want to know the expressivity of Mamba 3? Come by our ICLR poster! Sat, Apr 25 • 3:15 PM – 5:45 PM Pavilion 4 P4-#4409 The Expressive Limits of Diagonal SSMs for State-Tracking Joint work with Behnoush Khavari, Siamak Ravanbakhsh, and @apsarathchandar.

1

10

21

1.4K

Darshan Patil retweeted

kevin zhang@kevinghstz·7 Apr

Hierarchical planning unlocks long-horizon, non-greedy behavior in JEPA world models. Paper: arxiv.org/pdf/2604.03208 Website: kevinghst.github.io/HWM/ Code: github.com/kevinghst/HWM_…

9

45

257

89.3K

Darshan Patil retweeted

Ekaterina Lobacheva @ ICLR 2026 🇧🇷@KateLobacheva·31 Mar

Happy to be one of the organizers of the ICML Workshop on Weight-Space Symmetries 🥳 Submit your work by April 24! #weightsymmetry2026 #ICML2026

Weight Space Symmetries @ ICML 2026@weightsymmetry

📢Excited to announce the Workshop on Weight-Space Symmetries @icmlconf! We welcome 4-page submissions analysing symmetries, their effects on training and model structure, and practical methods to utilize them. Submission Deadline: April 24 (23:59 AoE) #ICML2026

0

2

18

2.8K

Darshan Patil retweeted

Alex Weers@a_weers·15 Mar

Finally finished! If you're interested in an overview of recent methods in reinforcement learning for reasoning LLMs, check out this blog post: aweers.de/blog/2026/rl-f… It summarizes ten methods, tries to highlight differences and trends, and has a collection of open problems

21

247

1.8K

322.7K

Darshan Patil retweeted

Vaibhav Adlakha@vaibhav_adlakha·12 Mar

Your LLM already knows the answer. Why is your embedding model still encoding the question? 🚨Introducing LLM2Vec-Gen: your frozen LLM generates the answer's embedding in a single forward pass — without ever generating the answer. Not only that, the frozen LLM can decode the embedding back into text. 🏆 SOTA self-supervised embeddings 🛡️ Free transfer of instruction-following, safety, and reasoning

5

37

193

50.4K

Darshan Patil retweeted

Martin Mundt@mundt_martin·4 Mar

Following our @dagstuhl seminar on Continual Learning in the Foundation Model Era, we are now sharing a roadmap! tldr: we view modular memory design as the missing piece to combine the capabilities of In-Weight & In-Context Learning for adaptation at scale arxiv.org/abs/2603.01761

1

7

38

17.5K

Darshan Patil retweeted

Rahaf Aljundi@AljundiRahaf·4 Mar

This fall, during a Dagstuhl seminar on continual learning, we discussed with various researchers from the field the roadmap for continual learning. We converged to one view: modular memory is the key to continual learning agents, as outlined in here arxiv.org/pdf/2603.01761

0

6

15

1K

Darshan Patil@dapatil211·5 Mar

I couldn't have done this work without my amazing collaborators: @pranshumalviya8, Mathieu Reymond, @qfournier2, and @apsarathchandar

Darshan Patil@dapatil211

🧬 New paper Scientific datasets evolve as science evolves. With proteins, new sequences get added, annotations get corrected, and noisy entries get curated out. Introducing CoPeP, a continual-pretraining benchmark for protein LMs. Details 🧵 1/n

0

9

288

Darshan Patil@dapatil211·5 Mar

If you’re building CL methods for foundation models, CoPeP is a realistic testbed where the world changes underneath you. Paper: arxiv.org/abs/2603.00253 Models/data: huggingface.co/collections/ch… n/n

1

2

244

Darshan Patil@dapatil211·5 Mar

Same story on downstream protein understanding: different methods have the best win rates on PEER vs DGEB. Continual pretraining is multi-objective, and CoPeP makes the trade-offs measurable. 10/n

1

0

1

151

Darshan Patil@dapatil211·5 Mar

🧬 New paper Scientific datasets evolve as science evolves. With proteins, new sequences get added, annotations get corrected, and noisy entries get curated out. Introducing CoPeP, a continual-pretraining benchmark for protein LMs. Details 🧵 1/n

2

29

84

8.5K
