Neta Shaul

37 posts

@shaulneta

PhD Student at @WeizmannScience

Joined June 2023
55 Following · 361 Followers
Neta Shaul @shaulneta
After multiple requests for the code of the visuals from my talk about Transition Matching, I made a notebook that reproduces the DTM vs. FM GIF! This demo is a good way to build intuition on how TM and FM differ. github.com/neta93/visual-… @urielsinger
0 replies · 2 reposts · 15 likes · 1.2K views
Neta Shaul @shaulneta
Incredible atmosphere at the poster session today! Thanks to everyone who visited 🙌 I’ll be at #NeurIPS2025 until Dec 8. If you missed the poster or just want to chat, my DMs are open. Shoutout to @EliahuHorwitz for the pictures!
[image]
Quoting Neta Shaul @shaulneta:

#NeurIPS2025 "Transition Matching" explores generative Markov processes with expressive transition kernels, going beyond the Gaussian kernel used in diffusion and flow models. Interested? Let's chat! 📍 Poster #3609 🕒 Wed at 11am - 2pm 📄 arxiv.org/abs/2506.23589

1 reply · 4 reposts · 27 likes · 3.8K views
Neta Shaul retweeted
Heli Ben-Hamu @helibenhamu
I'll be at NeurIPS on Dec 3-4. Would be happy to meet up and chat about efficient methods for sampling from language models ⚡️ Or, catch me at our EB-Sampler poster on Thursday at 4:30pm. Joint work with @itai_gat, @_dsevero, Niklas Nolte, Brian Karrer
[image]
0 replies · 8 reposts · 39 likes · 3K views
Neta Shaul @shaulneta
#NeurIPS2025 "Transition Matching" explores generative Markov processes with expressive transition kernels, going beyond the Gaussian kernel used in diffusion and flow models. Interested? Let's chat! 📍 Poster #3609 🕒 Wed at 11am - 2pm 📄 arxiv.org/abs/2506.23589
[image]
3 replies · 5 reposts · 26 likes · 5.3K views
Neta Shaul @shaulneta
Had a blast talking about Transition Matching at the HUJI Vision Seminar, big thanks to @EliahuHorwitz for inviting me! 🚀 If you like simple visual illustrations of complex ideas, I made a few in my slides: neta93.github.io/slides/transit…
3 replies · 1 repost · 23 likes · 2.2K views
Neta Shaul retweeted
Heli Ben-Hamu @helibenhamu
Excited to share our work Set Block Decoding! A new paradigm combining next-token prediction and masked (or discrete diffusion) models, allowing parallel decoding without any architectural changes and with an exact KV cache. Arguably one of the simplest ways to accelerate LLMs!
5 replies · 24 reposts · 115 likes · 25.7K views
Neta Shaul @shaulneta
[1/n] New paper alert! 🚀 Excited to introduce 𝐓𝐫𝐚𝐧𝐬𝐢𝐭𝐢𝐨𝐧 𝐌𝐚𝐭𝐜𝐡𝐢𝐧𝐠 (𝐓𝐌)! We're replacing short-timestep kernels from Flow Matching/Diffusion with... a generative model🤯, achieving SOTA text-2-image generation! @urielsinger @itai_gat @lipmanya
[GIF]
5 replies · 46 reposts · 289 likes · 85.8K views
Neta Shaul @shaulneta
@Fate_10kokoro Here are the r.v. DTM samples: x.com/shaulneta/stat…. Informally: if k is large and k/T is small, then repeatedly sampling the transition kernel k times with step size 1/T is roughly an Euler step of size k/T with the kernel's expectation as the velocity.
Quoting Neta Shaul @shaulneta:

If you're curious to dive deeper into Transition Matching (TM)✨🔍, a great starting point is understanding the similarities and differences between 𝐃𝐢𝐟𝐟𝐞𝐫𝐞𝐧𝐜𝐞 𝐓𝐫𝐚𝐧𝐬𝐢𝐭𝐢𝐨𝐧 𝐌𝐚𝐭𝐜𝐡𝐢𝐧𝐠 (𝐃𝐓𝐌) and Flow Matching (FM)💡.

0 replies · 0 reposts · 3 likes · 200 views
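A minimal numeric check of the informal claim in the tweet above, in a toy 1-D setting chosen purely for illustration (Gaussian source and target, linear path, independent coupling, and a soft tolerance standing in for "lines passing through the current state"); none of this is the paper's exact construction. Holding (x, t) fixed, the sum of k sampled differences of size 1/T should roughly match one Euler step of size k/T taken with the conditional mean.

```python
# Toy check: k stochastic kernel samples of size 1/T vs. one Euler step of size k/T.
# All distributions and the eps-weighted conditional below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.normal(0.0, 1.0, 5000)          # source samples
x1 = rng.normal(3.0, 0.5, 5000)          # target samples

def sample_diff(x, t, n, eps=0.05):
    """Draw n samples of X1 - X0 from pairs whose line passes near x at time t."""
    pts = (1 - t) * x0 + t * x1                       # each pair's position at time t
    w = np.exp(-0.5 * ((pts - x) / eps) ** 2)         # soft "intersects the current state"
    idx = rng.choice(len(x0), size=n, p=w / w.sum())
    return x1[idx] - x0[idx]

T, k = 1000, 50                 # fine discretization; k/T = 0.05 is small
x, t = 0.2, 0.3                 # (x, t) held fixed, the regime the tweet describes
accum = (1 / T) * sample_diff(x, t, k).sum()          # k stochastic steps of size 1/T
euler = (k / T) * sample_diff(x, t, 20000).mean()     # one Euler step with the mean
print(accum, euler)             # the two displacements should be close
```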
Gaopeng Ren @Fate_10kokoro
@shaulneta Does this mean transition matching also learns the distribution of the velocity? And if the number of timesteps is very large, does it converge to the expectation value of the velocity, that is, X_T-X_0?
1 reply · 0 reposts · 0 likes · 231 views
Neta Shaul @shaulneta
DTM vs FM👇 Lots of interest in how Difference Transition Matching (DTM) connects to Flow Matching (FM). Here is a short animation that illustrates Theorem 1 in our paper: For a very small step size (1/T), DTM converges to an Euler step of FM.
[GIF]
Quoting Neta Shaul @shaulneta:

[1/n] New paper alert! 🚀 Excited to introduce 𝐓𝐫𝐚𝐧𝐬𝐢𝐭𝐢𝐨𝐧 𝐌𝐚𝐭𝐜𝐡𝐢𝐧𝐠 (𝐓𝐌)! We're replacing short-timestep kernels from Flow Matching/Diffusion with... a generative model🤯, achieving SOTA text-2-image generation! @urielsinger @itai_gat @lipmanya

2 replies · 46 reposts · 327 likes · 25K views
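For readers without the GIF, here is a sketch of the same picture in a toy 1-D setting: a DTM-style trajectory that samples the conditional difference at every step, next to the FM Euler trajectory that uses the conditional mean. The bimodal target, the independent coupling, and the eps-weighted empirical conditional are illustrative assumptions, not the paper's setup.

```python
# Toy 1-D comparison: DTM-style stochastic steps vs. FM Euler steps on the same path.
import numpy as np

rng = np.random.default_rng(1)
x0s = rng.normal(0.0, 1.0, 4000)                      # source samples
x1s = np.concatenate([rng.normal(-2, 0.3, 2000),      # bimodal target
                      rng.normal(+2, 0.3, 2000)])

def cond_diffs(x, t, eps=0.05):
    """Empirical conditional of X1 - X0 given X_t near x (linear path, independent coupling)."""
    pts = (1 - t) * x0s + t * x1s                     # where each (X0, X1) pair sits at time t
    w = np.exp(-0.5 * ((pts - x) / eps) ** 2)
    return x1s - x0s, w / w.sum()

T = 400                                               # number of steps; 1/T is the step size
x_dtm = x_fm = 0.7                                    # shared starting point
for i in range(T):
    t = i / T
    d, p = cond_diffs(x_dtm, t)
    x_dtm += (1 / T) * d[rng.choice(len(d), p=p)]     # DTM: sample the difference
    d, p = cond_diffs(x_fm, t)
    x_fm += (1 / T) * (d * p).sum()                   # FM: Euler step with the conditional mean
print(x_dtm, x_fm)   # with T large, both endpoints should land near the same target mode
```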
Neta Shaul @shaulneta
@nico_dufour @urielsinger @itai_gat @lipmanya [2/2] Such an approach doesn’t hurt performance—on the contrary, it may offer a path for improvement. We verified that gains aren’t due to “overfitting” the transition kernel with fewer steps.
0 replies · 0 reposts · 1 like · 114 views
Neta Shaul @shaulneta
@nico_dufour @urielsinger @itai_gat @lipmanya [1/2] DTM is intrinsically a discrete-time process: a different time discretization ➡️ a different process. However, DTM depends only on the current time, which allows learning a continuous-time model (parameters are shared between processes); the discretization can then be selected at inference.
1 reply · 0 reposts · 1 like · 129 views
Neta Shaul @shaulneta
@nico_dufour @urielsinger @itai_gat @lipmanya Thanks Nicolas! FM learns the expected transition, while DTM learns to sample from the full transition distribution (slide). Adding a small MLP to FM didn't help; only when the MLP became a generative model (i.e., DTM) did we see improvements. I'll post more on the DTM–FM connection soon, stay tuned!
[image]
1 reply · 0 reposts · 4 likes · 300 views
Nicolas DUFOUR @nico_dufour
@shaulneta @urielsinger @itai_gat @lipmanya Hey, nice work! Something I struggle to understand is which part of the improvements comes from the framework and which comes from the MAR architecture. Have you tried training DTM without the MAR head, with a vanilla DiT? Or is the FM baseline also using MAR? Thanks!
1 reply · 0 reposts · 1 like · 367 views
Neta Shaul @shaulneta
@CSProfKGD I'm glad you take an interest in our work, Kosta. Mean Flows is indeed exciting work! From a TM perspective, they learn large-step-size transitions with a deterministic kernel, which is very interesting. I don't have a more elaborate answer at the moment, but I plan to look into it.
0 replies · 0 reposts · 4 likes · 144 views
Neta Shaul @shaulneta
If you're curious to dive deeper into Transition Matching (TM)✨🔍, a great starting point is understanding the similarities and differences between 𝐃𝐢𝐟𝐟𝐞𝐫𝐞𝐧𝐜𝐞 𝐓𝐫𝐚𝐧𝐬𝐢𝐭𝐢𝐨𝐧 𝐌𝐚𝐭𝐜𝐡𝐢𝐧𝐠 (𝐃𝐓𝐌) and Flow Matching (FM)💡.
[image]
Quoting Neta Shaul @shaulneta:

[1/n] New paper alert! 🚀 Excited to introduce 𝐓𝐫𝐚𝐧𝐬𝐢𝐭𝐢𝐨𝐧 𝐌𝐚𝐭𝐜𝐡𝐢𝐧𝐠 (𝐓𝐌)! We're replacing short-timestep kernels from Flow Matching/Diffusion with... a generative model🤯, achieving SOTA text-2-image generation! @urielsinger @itai_gat @lipmanya

2 replies · 15 reposts · 125 likes · 14.5K views
Neta Shaul @shaulneta
@vtaohu Great question! FM learns to approximate the expectation of the transition kernel, whereas DTM learns to sample from the underlying distribution of transitions. Hence, DTM is more expressive. Note that for a very small step size (1/T), FM's approximation is fully expressive!
[image]
0 replies · 0 reposts · 0 likes · 92 views
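A tiny concrete example of the expectation-vs-distribution distinction in the reply above, under toy choices made only for illustration (1-D, linear path, independent coupling, a source around 0 and a bimodal target): at this (x, t) the conditional of X1 - X0 is bimodal, so its mean, which is all FM keeps, falls between the modes, while a DTM-style kernel would return samples from both.

```python
# Toy illustration: FM keeps only the conditional mean; a DTM kernel samples the full conditional.
import numpy as np

rng = np.random.default_rng(2)
x0s = rng.normal(0.0, 0.5, 20000)                      # source around 0
x1s = np.concatenate([rng.normal(-2, 0.1, 10000),      # bimodal target
                      rng.normal(+2, 0.1, 10000)])

x, t, eps = 0.0, 0.2, 0.03                             # a state from which both modes are reachable
pts = (1 - t) * x0s + t * x1s                          # where each (X0, X1) pair sits at time t
w = np.exp(-0.5 * ((pts - x) / eps) ** 2)              # soft "passes through the current state"
p = w / w.sum()
d = x1s - x0s                                          # the transition difference of each pair

print("FM velocity (conditional mean):", (d * p).sum())          # sits between the two modes
print("DTM-style samples:", d[rng.choice(len(d), size=8, p=p)])  # cluster near -2.5 and +2.5
```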
Tao HU @vtaohu
@shaulneta Hi, Neta, could you elaborate on why DTM is "a more expressive kernel"? I am confused here. 😀
1 reply · 0 reposts · 1 like · 79 views
Neta Shaul @shaulneta
The Difference Transition Matching (DTM) process is so simple to illustrate that you can calculate it on a whiteboard! At each step: draw all lines connecting source and target (shaded) ⬇️ list those intersecting the current state (yellow) ⬇️ sample a line from the list (green)
[GIF]
Quoting Neta Shaul @shaulneta:

[1/n] New paper alert! 🚀 Excited to introduce 𝐓𝐫𝐚𝐧𝐬𝐢𝐭𝐢𝐨𝐧 𝐌𝐚𝐭𝐜𝐡𝐢𝐧𝐠 (𝐓𝐌)! We're replacing short-timestep kernels from Flow Matching/Diffusion with... a generative model🤯, achieving SOTA text-2-image generation! @urielsinger @itai_gat @lipmanya

2 replies · 16 reposts · 133 likes · 10K views
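The whiteboard recipe above translates almost line-for-line into code. Below is a literal toy transcription in numpy, where the source and target points, the linear "lines", the tolerance used for "intersecting the current state", and the uniform pick among the hits are all illustrative choices rather than the paper's exact construction.

```python
# Toy transcription of the whiteboard DTM recipe: enumerate lines, keep the ones through x, sample one.
import numpy as np

rng = np.random.default_rng(3)
src = rng.normal(0.0, 1.0, 200)                 # source points
tgt = rng.normal(3.0, 0.5, 200)                 # target points
pairs = [(a, b) for a in src for b in tgt]      # step 1: all source-target lines

def dtm_step(x, t, h, eps=0.05):
    # step 2: keep the lines that pass (approximately) through the current state at time t
    hits = [(a, b) for a, b in pairs if abs((1 - t) * a + t * b - x) < eps]
    # step 3: sample one of them and move along it for a step of size h
    a, b = hits[rng.integers(len(hits))]
    return x + h * (b - a)

# run the process from a source sample up to t = 1
T, x = 100, float(rng.choice(src))
for i in range(T):
    x = dtm_step(x, i / T, 1 / T)
print(x)   # should land in the target cloud around 3
```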
Neta Shaul @shaulneta
@thoma_gu @urielsinger @itai_gat @lipmanya [2/2] DART-FM, however, is not an immediate TM variant according to our formulation. It can be seen as learning a non-Markov diffusion kernel, which in a second stage is composed with a small flow-matching kernel to improve expressiveness.
0 replies · 0 reposts · 3 likes · 218 views
Neta Shaul @shaulneta
@thoma_gu @urielsinger @itai_gat @lipmanya [1/2] Thanks for pointing this out! Indeed, DART-AR is a variant of TM, so I added a small slide showing the connection. In a nutshell, it uses an independent process, but the kernel is Gaussian and hence not fully expressive. (We will add a reference in the next version.)
[image]
2 replies · 0 reposts · 7 likes · 1.4K views