
@josephdviviano Yes, sure, for the next release we'll compare against this implementation. Thanks a lot for the ref!
Daniil Tiapkin

@dtiapkin
Research Scientist @ Google DeepMind | PhD in RL 🇫🇷

1/ If you're familiar with RLHF, you've likely heard of reward hacking, where over-optimizing an imperfect reward model leads to unintended behaviors. But what about teacher hacking in knowledge distillation: can the teacher be hacked, like rewards in RLHF?
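(A minimal toy sketch, not from the thread, of the distillation setup the tweet alludes to: the student minimizes a divergence to an imperfect teacher, so driving the distillation loss to zero only matches the teacher, not the underlying task. All distributions below are hypothetical.)

```python
import numpy as np

def kl(p, q):
    """KL(p || q) for discrete distributions with full support."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

# Hypothetical next-token distributions over a 3-token vocabulary.
true_dist    = [0.70, 0.20, 0.10]  # ground truth (unknown in practice)
teacher_dist = [0.60, 0.30, 0.10]  # imperfect proxy for the truth
student_dist = [0.60, 0.30, 0.10]  # student that perfectly imitates the teacher

# Distillation loss is zero: the teacher is fully "optimized against"...
print(kl(student_dist, teacher_dist))  # 0.0
# ...yet the student still deviates from the truth, the analogue of
# over-optimizing a reward model in RLHF.
print(kl(true_dist, student_dist) > 0.0)  # True
```

The gap between "zero loss against the teacher" and "nonzero error against the truth" is the room in which teacher hacking can occur.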
