Ankit Dhall
263 posts

@ankitdhall
Sr. Deep Learning Engineer @NVIDIA Visual Gen AI | DL Algorithms Prev. @latticeflowai Seervision @ETH @amzracing @motionaldrive @UniFreiburg @iiit_hyderabad VIT

Applications change, but the principles are enduring. After a year's hard work led by @JCJesseLai, we are really excited to share this deep, systematic dive into the mathematical principles of diffusion models. This is a monograph we always wished we had.


New paper 📜: Tiny Recursion Model (TRM) is a recursive reasoning approach with a tiny 7M-parameter neural network that obtains 45% on ARC-AGI-1 and 8% on ARC-AGI-2, beating most LLMs. Blog: alexiajm.github.io/2025/09/29/tin… Code: github.com/SamsungSAILMon… Paper: arxiv.org/abs/2510.04871



We did a very careful study of 10 optimizers with no horse in the race. Despite all the excitement about Muon, Mars, Kron, Soap, etc., at the end of the day, if you tune the hyperparameters rigorously and scale up, the speedup over AdamW diminishes to only 10% :-( Experiments are made possible by Marin (github.com/marin-communit…); anyone developing new optimizers: please come try your method on this benchmark!




NVIDIA chips are manufactured by TSMC, a Taiwanese company. They're created using EUV lithography machines manufactured by ASML, a Dutch company. These machines consist of >50% German parts (by value), in particular ZEISS optics.
