

Discrete Diffusion Reading Group

@diffusion_llms
📚 Journal club on discrete diffusion models 🎥 Replays available on YouTube! Contact: [email protected] Hosted by @ssahoo_, @jdeschena, @zhihanyang_






📢 May 18 (Mon): IDLM: Inverse-distilled Diffusion Language Models 🤔Diffusion Language Models (DLMs) have recently achieved strong results in text generation. However, their multi-step sampling leads to slow inference, limiting practical use. 💡To address this, the authors extend Inverse Distillation, a technique originally developed to accelerate continuous diffusion models, to the discrete setting. However, this extension introduces both theoretical and practical challenges. 🔧To overcome these challenges, the authors first provide a theoretical result demonstrating that their inverse formulation admits a unique solution, thereby ensuring valid optimization. They then introduce gradient-stable relaxations to support effective training. 📊As a result, experiments on multiple DLMs show that their method, Inverse-distilled Diffusion Language Models (IDLM), reduces the number of inference steps by 4×–64× while preserving the teacher model’s entropy and generative perplexity. This Monday, David Li (scholar.google.com/citations?user…) and Nikita Gushchin (scholar.google.com/citations?user…) will present their jointly led paper, which was recently accepted at ICML 2026. Collaborators of this work include: Dmitry Abulkhanov (@dabulkhanov_), Eric Moulines (scholar.google.com/citations?user…), Ivan Oseledets (@oseledetsivan), Maxim Panov (@maxim_panov), Alexander Korotin (akorotin.netlify.app) Paper link: arxiv.org/abs/2602.19066
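For intuition only (this is not the paper's algorithm), the toy Python sampler below shows what cutting inference steps means for a masked DLM: the teacher walks from all-masks to text over many denoiser calls, and a distilled student covers the same trajectory in a fraction of them. The denoiser, vocabulary, and step counts are all invented for the sketch.

```python
import random

MASK = "[MASK]"
VOCAB = list("abcdefgh")

def denoiser(seq):
    """Toy stand-in for a trained denoiser: proposes a token for every
    masked position (a real DLM would output logits here)."""
    return [random.choice(VOCAB) if tok == MASK else tok for tok in seq]

def sample(num_steps, length=16):
    """Ancestral masked-diffusion sampling: each step commits the
    denoiser's prediction at a fraction of the remaining masked slots."""
    seq = [MASK] * length
    for step in range(num_steps):
        proposal = denoiser(seq)
        masked = [i for i, tok in enumerate(seq) if tok == MASK]
        # Unmask just enough positions to finish in the remaining steps.
        k = min(len(masked), max(1, len(masked) // (num_steps - step)))
        for i in random.sample(masked, k):
            seq[i] = proposal[i]
    return "".join(seq)

# The teacher needs many denoiser calls; a distilled student traverses the
# same masked-to-clean path in far fewer. 64 -> 8 steps here is an 8x
# reduction, inside the 4x-64x range the paper reports.
print(sample(num_steps=64))  # "teacher": one denoiser call per step
print(sample(num_steps=8))   # "student": same length, 8 calls total
```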




📢 May 11 (Mon): Unifying Masked Diffusion Models with Various Generation Orders and Beyond 🤔AR generates left-to-right; masked diffusion generates in any order; and block diffusion generates block-wise left-to-right, with random order within each block. Can we unify all these frameworks and further learn the generation order jointly with token prediction? 💡The authors propose OeMDM, a unified masked diffusion framework that can express various generation orders, and LoMDM, which jointly learns the generation order and the diffusion model. 🔍Everything comes down to the scheduler: by making the forward and reverse schedulers maximally flexible, it becomes possible to describe all generation orders, even learnable generation orders, within the masked diffusion framework. 📈LoMDM achieves SOTA among discrete diffusion models across all benchmarks, and even outperforms block diffusion models, which strongly benefit from left-to-right bias! This Monday, Chunsan Hong (@ChunsanHong) will present his paper, which received Spotlight at ICML 2026. Collaborators of this work include: Sanghyun Lee, Jong Chul Ye (bispl.weebly.com/professor.html) Paper link: arxiv.org/abs/2602.02112
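To make "everything comes down to the scheduler" concrete, here is a minimal sketch (our toy, not OeMDM's actual formulation): a single sampler in which only the unmasking order changes, recovering left-to-right, any-order, and block-wise decoding as special cases. LoMDM's step beyond this would be learning that order instead of fixing it.

```python
import random

def left_to_right(length):
    # AR-style order.
    return list(range(length))

def any_order(length):
    # Vanilla masked diffusion: uniformly random order.
    idx = list(range(length))
    random.shuffle(idx)
    return idx

def blockwise(length, block=4):
    # Block diffusion: blocks left-to-right, random within each block.
    idx = []
    for start in range(0, length, block):
        blk = list(range(start, min(start + block, length)))
        random.shuffle(blk)
        idx += blk
    return idx

def generate(order_fn, length=8):
    """One sampler; only the schedule differs. A learned planner, as in
    LoMDM, would replace the fixed order_fn with a trained module."""
    seq = ["[MASK]"] * length
    for pos in order_fn(length):
        seq[pos] = f"x{pos}"  # stand-in for the denoiser's prediction
    return seq

for fn in (left_to_right, any_order, blockwise):
    print(fn.__name__, generate(fn))
```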

📢Thrilled to share our new paper: Esoteric Language Models (Eso-LMs) > 🔀Fuses autoregressive (AR) and masked diffusion (MDM) paradigms > 🚀First to unlock KV caching for MDMs (65x speedup!) > 🥇Sets new SOTA on generation speed-vs-quality Pareto frontier How? Dive in👇 [🧵1/13] 📜Paper: arxiv.org/abs/2506.01928 📘Blog: s-sahoo.com/Eso-LMs/ 💻Code: github.com/s-sahoo/Eso-LMs Project co-led with @ssahoo_
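A back-of-envelope cost model (our own assumed numbers, not figures from the paper) for why KV caching is the headline: vanilla MDM sampling re-encodes the full sequence at every denoising step, while caching keys/values of committed tokens collapses the total attention work toward the AR regime.

```python
# Hypothetical numbers; only the scaling argument matters.
L, STEPS = 1024, 64  # sequence length, denoising steps

# Vanilla MDM sampling: every step re-encodes all L tokens,
# so attention work is roughly STEPS * L^2.
mdm_cost = STEPS * L * L

# With a KV cache, keys/values of already-committed tokens are computed
# once and reused, collapsing total work toward the AR-style
# sum_{i<=L} i ~ L^2 / 2.
cached_cost = L * L // 2

print(f"~{mdm_cost // cached_cost}x less attention work")  # ~128x here
```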

Esoteric Language Models 🔥 Beats MDLM on the speed-quality Pareto frontier 🔥 Exact KV Caching 🔥 Exact Likelihood Computation 🔖 arxiv.org/abs/2506.01928 🖥️ s-sahoo.com/Eso-LMs/ x.com/ssahoo_/status…


✈️ Discrete Diffusion Meetup @iclr_conf 📅 RioCentro | April 24 (Thurs) | 4PM I’ll share the exact location in the comments as we get closer. Save this post so you don’t miss the update.

Thank you all for coming out to the discrete diffusion meetup. Turnout was over 100 people 😊

The location has been selected! Let's meet in the garden in the middle of the conference center, near the white structure and under the trees 🚀


✈️Discrete Diffusion Meetup @iclr_conf 📅 April 24, 4 pm 📍 RioCentro (TBD; in the comments) If you’re into discrete diffusion, come hang out, talk shop, and meet others working in the space. Hosts: @ssahoo_ @jdeschena


📢 April 20 (Mon): Planner Aware Path Learning in Diffusion Language Models Training 🤔A key limitation of diffusion language models is that they are usually trained without accounting for the planner that guides decoding at inference time. 💡Planner Aware Path Learning (PAPL) brings the planning process into the training objective so that learning better matches how generation is actually performed. 🔍By viewing decoding as a coupled planner–denoiser process, PAPL provides a more principled training framework for diffusion language models. 📈Across experiments, this leads to improved sequence generation quality over simpler training objectives, with only a 2-line code change. This Monday, Fred Zhangzhi Peng (@pengzhangzhi1) and Zachary Bezemek (@bezemekz) will present their jointly led paper, which received Oral at ICLR 2026! Collaborators: Jarrid Rector-Brooks (@jarridrb), Shuibai Zhang (@ShuibaiZ69721), Anru R. Zhang (anruzhang.github.io), Michael Bronstein (@mmbronstein), Alexander Tong (@AlexanderTong7), and Avishek Joey Bose (@bose_joey). Paper link: arxiv.org/abs/2509.23405
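As a toy illustration of the coupled planner–denoiser view (a sketch under our own assumptions, not PAPL's training objective): at inference a planner chooses which masked position to commit next and the denoiser fills it in; PAPL's contribution is making training aware of this loop. The confidence heuristic and all names below are invented.

```python
import random

MASK = "[MASK]"

def denoiser(seq):
    """Toy denoiser: a (confidence, token) guess for each masked slot.
    A real model would produce logits; confidence here is random."""
    return {i: (random.random(), f"tok{i}")
            for i, t in enumerate(seq) if t == MASK}

def planner(preds):
    """Toy planner: commit the position the denoiser is most sure about.
    PAPL's point is that training should see this planner too, instead
    of assuming a uniformly random unmasking path."""
    return max(preds, key=lambda i: preds[i][0])

def decode(length=8):
    seq = [MASK] * length
    while MASK in seq:            # coupled planner-denoiser loop
        preds = denoiser(seq)
        pos = planner(preds)      # planner picks *where*
        seq[pos] = preds[pos][1]  # denoiser decides *what*
    return seq

print(decode())
```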


