Tim Tomov
@timtomov
421 posts

PhD Student in Machine Learning @ DAML (TUM). When do models know that they don't know? Doing uncertainty quantification stuff.

Munich, Germany · Joined November 2015
233 Following · 70 Followers
Pinned Tweet
Tim Tomov @timtomov ·
Are LLM outputs just language? We often don’t know what the answer to a question will be - but we do know its structure. Numbers, lists, sets, ... Our framework models outputs in their underlying structure for better answers + uncertainty. w/ @dfuchsgruber @guennemann [1/4]🧵
1 reply · 4 reposts · 6 likes · 331 views
Tim Tomov reposted
Andrew Gordon Wilson @andrewgwils ·
A reminder that you cannot learn from data without making assumptions. You should never apologize for making assumptions. Embrace them, be transparent about what they are, and enabled by those assumptions, update based on new information.
5 replies · 16 reposts · 178 likes · 7.9K views
Tim Tomov reposted
Haitham Bou Ammar @hbouammar ·
I want to take this post to thank the amazing people of X for reviewing and reading our paper: arxiv.org/pdf/2602.18292… - where we claim that decoding is an optimisation layer on top of LLMs. Thank you for letting us know which references we missed. We are adding those and will update the arXiv soon. So far, the list is:
@amritsinghbedi3 Controlled decoding from language models arxiv.org/pdf/2310.17022
@amritsinghbedi3 Transfer Q⋆: Principled Decoding for LM Alignment arxiv.org/pdf/2405.20495
@elon_lit Scaled-Dot-Product Attention as One-Sided Entropic Optimal Transport arxiv.org/abs/2508.08369
@timtomov Task-Awareness Improves LLM Generations and Uncertainty arxiv.org/pdf/2601.21500
Let me know if I missed any! Please keep them coming! Would be happy to cite the missing refs! Also make sure to read those papers above, I think they are super awesome :D #AI #MachineLearning
2 replies · 3 reposts · 45 likes · 3K views
Tim Tomov @timtomov ·
@hbouammar Looks really interesting! Two things I will check out are how MBR (minimum Bayes risk) decoding fits into this, and an application of your ideas in a space different from the token space. (We recently proposed MBR decoding in such a space: arxiv.org/pdf/2601.21500)
1 reply · 0 reposts · 4 likes · 347 views
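The MBR decoding idea referenced above can be sketched minimally: sample several candidate answers, then return the one with the lowest expected distance to the rest. This is an illustrative toy only; the numeric answer space and absolute-difference loss are assumptions, not the setup of either paper:

```python
def mbr_decode(candidates, distance):
    """Return the minimum-Bayes-risk candidate: the one with the lowest
    average distance to all sampled candidates."""
    best, best_risk = None, float("inf")
    for c in candidates:
        risk = sum(distance(c, other) for other in candidates) / len(candidates)
        if risk < best_risk:
            best, best_risk = c, risk
    return best

# Toy numeric answers sampled from a model; absolute difference as the loss.
samples = [3.0, 3.1, 2.9, 3.0, 10.0]
print(mbr_decode(samples, lambda a, b: abs(a - b)))  # 3.0 (the outlier 10.0 loses)
```

The consensus answer wins because nearby samples reinforce each other, which is the appeal of MBR over picking the single highest-probability sample.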
Haitham Bou Ammar @hbouammar ·
I am happy to announce our paper, where we formalise LLM decoding as a special case of optimisation over a probability simplex. 📜 arxiv.org/pdf/2602.18292
This is work in progress! How can you help?
1⃣ We probably missed citations! If you think your work is relevant to this topic, make sure to leave a comment with the paper that we should cite. We will review it and update the arXiv.
2⃣ Try to derive new special cases. We will review it, and if it works, we are happy to add you to our team! Why not work together? We can do more!
#AI #machinelearning
4 replies · 20 reposts · 140 likes · 8.1K views
Tim Tomov reposted
Machine Learning Street Talk @MLStreetTalk ·
Interesting research from Anthropic: when you have increasingly large models and increasingly complex tasks, it's more likely that the models will give you different answers if you run the same query multiple times. On easy tasks, larger models actually become more coherent.
Think of a "cone" of possible trajectories whose branching factor gets bigger with more possibilities (due to the larger models "knowing more options to explore" and more complex problems having more "possible aspects"). The amount of time spent reasoning (trajectory length) then makes the end state multiplicatively more incoherent. Having a large model with an easy task means the correct answer is definitely "in there" and the model is less likely to become distracted.
They argue this is relevant for AI safety because some might have assumed that larger models would have convergent "instrumental goals" and would give a consistently wrong rather than randomly wrong answer. The "hot mess theory of intelligence" (Sohl-Dickstein, 2023) argues that "as entities become more intelligent, their behaviour tends to become more incoherent, and less well described through a single goal."
Anthropic @AnthropicAI
New Anthropic Fellows research: How does misalignment scale with model intelligence and task complexity? When advanced AI fails, will it do so by pursuing the wrong goals? Or will it fail unpredictably and incoherently—like a "hot mess?" Read more: alignment.anthropic.com/2026/hot-mess-…
80 replies · 204 reposts · 1.7K likes · 170.3K views
Tim Tomov @timtomov ·
Better uncertainty. We similarly outperform existing UQ methods across a wide range of tasks. The key: variation ≠ uncertainty. Our Bayes risk is task-aware and distance-sensitive, capturing uncertainty with higher granularity. 📝 arxiv.org/abs/2601.21500 [4/4]🧵
0 replies · 0 reposts · 0 likes · 50 views
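The "variation ≠ uncertainty" point above can be illustrated with a distance-sensitive risk score: uncertainty as the mean pairwise distance between sampled answers, rather than a count of how many distinct answers appeared. The distance function and toy samples here are hypothetical, not taken from the paper:

```python
def bayes_risk_uncertainty(samples, distance):
    """Mean pairwise distance between sampled answers.
    Nearly identical answers contribute almost nothing, so semantically
    close samples yield low uncertainty even if the strings differ."""
    n = len(samples)
    return sum(distance(a, b) for a in samples for b in samples) / (n * n)

d = lambda a, b: abs(a - b)
close = [3.00, 3.01, 2.99]   # three distinct answers, but nearly the same value
spread = [1.0, 5.0, 9.0]     # three distinct answers, far apart
print(bayes_risk_uncertainty(close, d) < bayes_risk_uncertainty(spread, d))  # True
```

A frequency-based estimator sees "three distinct answers" in both cases; the distance-sensitive risk separates them, which is the higher granularity the tweet refers to.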
Tim Tomov @timtomov ·
Better answers. The Bayes-optimal answer outperforms other decoding methods across a wide range of tasks. The key: We leverage information from the full predictive distribution and even enable the synthesis of new responses beyond those produced by the model [3/4]🧵
1 reply · 0 reposts · 0 likes · 73 views
Tim Tomov @timtomov ·
Are LLM outputs just language? We often don’t know what the answer to a question will be - but we do know its structure. Numbers, lists, sets, ... Our framework models outputs in their underlying structure for better answers + uncertainty. w/ @dfuchsgruber @guennemann [1/4]🧵
1 reply · 4 reposts · 6 likes · 331 views
Tim Tomov reposted
dr. jack morris @jxmnop ·
Wondering how to attend an ML conference the right way? Ahead of NeurIPS 2025 (30k attendees!), here are ten pro tips:
1. Your main goals: (i) meet people, (ii) regain excitement about work, (iii) learn things, in that order.
2. Make a list of papers you like and seek them out at poster sessions. Try to talk to the authors; you can learn much more from them than from a PDF.
3. Pick one workshop and one tutorial that sound most interesting. Skip the rest.
4. Cold email people you want to meet but haven't. Check Twitter and the accepted-papers list. PhD students are especially responsive.
5. Practice a concise pitch of unpublished research you're working on for "what are you interested in rn?". Focus on big unanswered questions and exciting new directions, *not* papers.
6. Skip the orals. Posters are higher-bandwidth, more engaging, more invigorating. Orals are a good time to go for a walk or talk in the hallway.
7. For the love of god, do NOT work on other research in your hotel room. Save mental bandwidth for the conference. (This may seem obvious; you'd be surprised.)
8. Talk to people outside your area. There are many smart people working on niches <10 people understand. Learn about one or two that won't help your own work.
9. Attend one social each night. Don't overthink it or get caught up in status games. They're all fun.
10. Take breaks. You can't go to everything, and conferences consume more energy than a normal workweek.
Hope this helps, and sad I'm not attending NeurIPS, have fun :)
28 replies · 129 reposts · 1.5K likes · 136.4K views
Tim Tomov reposted
Amine Ketata @amine_ketata ·
Excited to share that my first PhD paper, which introduces a new diffusion model for relational databases, has been accepted to #NeurIPS2025! We will be presenting it this week in San Diego. ☀️🌴 Joint work with @ludke_david, @SchwinnLeo, and @guennemann. 🧵 1/
1 reply · 8 reposts · 12 likes · 1.5K views
Tim Tomov reposted
Filippo Guerranti @ NeurIPS25
Heading to San Diego for #NeurIPS2025! 🌴☀️ I’ll be presenting 3 recent papers covering generative models for hierarchies, spatiotemporal tissue dynamics, and long-range graph learning. If you're around, drop by and say hi! 👋 Here’s the schedule 🧵👇
1 reply · 13 reposts · 22 likes · 1.3K views
Tim Tomov @timtomov ·
Found this great post by @alemi: blog.alexalemi.com/kl-is-all-you-…. It shows how many ML methods boil down to minimising the KL between the joint of the real causal world and the world we want (e.g., real images → latents, but we want latents → images (VAE))
0 replies · 0 reposts · 0 likes · 53 views
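The KL view in the linked post can be made concrete on a discrete toy example: fitting a model q by maximum likelihood on samples from p minimises KL(p || q), since KL(p||q) = -H(p) - E_p[log q] and only the cross-entropy term depends on q. A minimal sketch (the distributions are invented for illustration):

```python
import math

def kl(p, q):
    """KL divergence between two discrete distributions (in nats)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.3, 0.2]            # the "real world" distribution
q_match = [0.5, 0.3, 0.2]      # a model that matches it
q_off   = [0.1, 0.1, 0.8]      # a model that does not
print(kl(p, q_match))          # 0.0
print(kl(p, q_off) > 0)        # True: mismatch costs extra nats
```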
Tim Tomov @timtomov ·
These results reveal a deeper issue: current uncertainty estimators only work when the world is unambiguous. But language never is. To make uncertainty reliable, we need to rethink current paradigms. arxiv.org/abs/2511.04418 [4/4]🧵
0 replies · 0 reposts · 3 likes · 141 views
Tim Tomov @timtomov ·
Why do these estimators seem to work? For single answers, the true distribution is one-hot. We theoretically show that entropy and mutual information track model uncertainty (EU). Once ambiguity enters, the truth can lie anywhere on the simplex — and the link breaks. [3/4]🧵
1 reply · 0 reposts · 3 likes · 181 views
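The entropy/mutual-information estimators this thread discusses are typically computed from an ensemble (or multiple samples) of predictive distributions: total uncertainty is the entropy of the mean prediction, aleatoric uncertainty is the mean member entropy, and their gap (the mutual information) is used as the epistemic, i.e. model-uncertainty, signal. A minimal two-class sketch, not the paper's setup:

```python
import math

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def decompose(member_preds):
    """Split predictive uncertainty: total = H(mean prediction),
    aleatoric = mean member entropy, epistemic = total - aleatoric (the MI)."""
    k = len(member_preds[0])
    mean = [sum(p[i] for p in member_preds) / len(member_preds) for i in range(k)]
    total = entropy(mean)
    aleatoric = sum(entropy(p) for p in member_preds) / len(member_preds)
    return total, aleatoric, total - aleatoric

# Members confidently disagree: high epistemic (model) uncertainty.
total, aleatoric, mi = decompose([[0.99, 0.01], [0.01, 0.99]])
print(round(mi, 3))  # 0.637
```

The failure mode the thread points at: on a genuinely ambiguous question, every member could predict [0.5, 0.5], giving mi = 0 even though the answer distribution is maximally spread, so the MI no longer tracks what the model doesn't know.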
Tim Tomov @timtomov ·
Can we actually tell when LLMs know or don’t know? For questions with a single answer, that works. But once ambiguity enters - several answers are correct - current methods collapse, confusing model with data uncertainty. w/ @dfuchsgruber @TomWollschlager @guennemann [1/4]🧵
2 replies · 11 reposts · 19 likes · 1.5K views
Tim Tomov reposted
David Lüdke @ludke_david ·
Open-source Diffusion LLMs easily break GPT-5! In “Diffusion LLMs are Natural Adversaries for any LLM,” we show that Inpainting on Diffusion LLMs yields efficient, transferable jailbreaks without any model access. @TomWollschlager, Paul Ungermann, @guennemann, @leoschwinn 🧵
1 reply · 15 reposts · 18 likes · 2.1K views