Yuval Atzmon

216 posts

@AtzmonYuval

Research Scientist @NVIDIA, Generative AI, reasoning across different data modalities, and compositionality with few or zero examples. Opinions are my own.

London, United Kingdom · Joined December 2016
194 Following · 546 Followers
Roei Herzig@roeiherzig·
Excited to be at @iclr_conf ICLR 2026 in Rio next week 🌴🇧🇷✨ I’ll be presenting two of our papers:
🤖 Learning to Grasp Anything by Playing with Random Toys (lego-grasp.github.io) -> We show that training robots on random toys enables zero-shot grasping of real-world objects.
📄🌐 DAVE: A VLM Vision Encoder for Document Understanding and Web Agents -> We developed a vision encoder purpose-built for VLMs and tailored to document understanding and web agents.
If you’ll be there, let’s connect! 🚀 #iclr2026 #PhysicalAI #agents
Yuval Atzmon retweeted
Gal Dalal@DalalGal·
1/ More test-time compute can actually hurt LLM reasoning. ⚠️ Beam search is often treated as a free lunch: wider beam, more candidates, better answers. In our new paper, we show that after a certain point, the opposite can happen.
Yuval Atzmon@AtzmonYuval·
A surprising finding: instead of learning spatial patterns, classifiers cheat by locking onto linguistic traces from the prompt that leak into the attention maps. To prevent this, we inverted each training image and paired it with both the correct and the incorrect relation, so classifiers can't take shortcuts.
Yuval Atzmon@AtzmonYuval·
The cool thing: despite training on single spatial relations, Learn-to-Steer generalizes to multiple relations in a single image, even 5 objects with 3 relations. And it works across different diffusion architectures, including MMDiT.
Yuval Atzmon@AtzmonYuval·
This was a fun work, led by the skilled @sapiryiflach. T2I models struggle with spatial reasoning: "a dog to the right of a cat" often comes out wrong. Instead of handcrafting a test-time loss function, we trained a classifier on attention maps and used it as a differentiable loss.
Sapir Yiflach@sapiryiflach

🚀Excited to present our new paper that has been accepted to #WACV2026! Text-to-image models often fail at simple spatial tasks, like placing a dog to the right of a teddy bear. Our solution: Learn-to-Steer. We learn a loss function directly from attention maps and apply it during inference. This work was done together with @AtzmonYuval and @GalChechik 📰arXiv: arxiv.org/abs/2509.02295 🌐Project page: learn-to-steer-paper.github.io 📽️Video: youtu.be/KaxRwlE-UFg

Yuval Atzmon retweeted
Sapir Yiflach@sapiryiflach·
🚀Excited to present our new paper that has been accepted to #WACV2026! Text-to-image models often fail at simple spatial tasks, like placing a dog to the right of a teddy bear. Our solution: Learn-to-Steer. We learn a loss function directly from attention maps and apply it during inference. This work was done together with @AtzmonYuval and @GalChechik 📰arXiv: arxiv.org/abs/2509.02295 🌐Project page: learn-to-steer-paper.github.io 📽️Video: youtu.be/KaxRwlE-UFg
Yuval Atzmon retweeted
Dvir Samuel@dvir_samuel·
🚀 Excited to share our new paper: “Fast Autoregressive Video Diffusion & World Models with Temporal Cache Compression & Sparse Attention.” We address attention bottlenecks in auto-regressive video diffusion, enabling ×5–×10 speedup and constant memory over long rollouts.
Yuval Atzmon retweeted
Chen Tessler@ChenTessler·
At @nvidia, we built ProtoMotions to help us, and researchers world-wide, innovate quickly without compromising on applicability. We're proud to announce ProtoMotions3 -- our biggest release yet! 🧵👇
Yuval Atzmon retweeted
Assaf Shocher@AssafShocher·
They tell you neural nets are non-linear. What does "linear" even mean?! Linearity is only defined given two vector spaces, X → Y. What if we could find a different pair of spaces where NNs ARE linear? 🤯 We do it and use it for many apps, such as one-step diffusion! 🧵
Or Patashnik@OPatashnik·
📢 Today I begin my first semester as faculty in Computer Science at @TelAvivUni! Excited to start this new journey, and grateful to teach & research where my own journey began 🩵
Yuval Atzmon retweeted
Bryan Catanzaro@ctnzr·
Today we're releasing NVIDIA Nemotron Nano v2 - a 9B hybrid SSM that is 6X faster than similarly sized models, while also being more accurate. Along with this model, we are also releasing most of the data we used to create it, including the pretraining corpus. Links to the models, datasets, and tech report are here: research.nvidia.com/labs/adlr/NVID…
Yuval Atzmon@AtzmonYuval·
@ziv_ravid Did you try with markdown? There's an option to code slides using markdown syntax. This would give you instant iteration if you're using Cursor and a VS Code plugin. I was planning to try it next time I prepare a presentation.
Ravid Shwartz Ziv@ziv_ravid·
I ended up with ~70 slides, which meant the context window (and my token limit) filled up soooo fast all the time (I had to paste relevant papers each time). Lots of back-and-forth with the model (asking it to generate LaTeX code, copying to Overleaf, checking results).
Mehul Damani @ICLR@MehulDamani2·
🚨New Paper!🚨 We trained reasoning LLMs to reason about what they don't know. o1-style reasoning training improves accuracy but produces overconfident models that hallucinate more. Meet RLCR: a simple RL method that trains LLMs to reason and reflect on their uncertainty -- improving both accuracy ✅ and calibration 🎯. [1/N]
Yuval Atzmon retweeted
UriG@uri_gadot·
Tired of manual #ComfyUI workflow design? While recent methods predict them, our new paper, FlowRL, introduces a Reinforcement Learning framework that learns to generate complex, novel workflows for you! paper [arxiv.org/abs/2505.21478]
Yuval Atzmon retweeted
Gal Dalal@DalalGal·
1/4 🚨 1st of 3 ICML 2025 papers! We bring gradient boosting trees (like XGBoost) to RL — live on real datacenters. Our GBRL framework is robust, efficient, and deployable on lightweight hardware — even RISC-V CPUs 💻 🧵👇
Yuval Atzmon retweeted
Yftah Ziser@YftahZ·
1/6🚀 New #ACL2025Findings: We show you can predict if Chain-of-Thought (CoT) reasoning will succeed — before any tokens are generated! This works with LLMs not specifically trained for reasoning—meaning powerful signals emerge naturally in early processing.
Yuval Atzmon@AtzmonYuval·
Tomorrow, 3pm, #ICLR2025, super creative work by @YoadTewel. Adding objects to images, in natural ways, just from text prompts. It's completely zero-shot, and also resonates with "affordance" in vision, robotics and CogSci. I'll be there too. Come say 👋!
Yoad Tewel@YoadTewel

I'm going to present Add-it at #ICLR2025 tomorrow (Thursday) @ 3pm - poster #163! Project page: research.nvidia.com/labs/par/addit/ If you're around this week, feel free to DM me - happy to chat! Details below ⬇️🧵
