Petar Veličković

3.3K posts

@PetarV_93

Senior Staff Research Scientist @GoogleDeepMind | Affiliated Lecturer @Cambridge_Uni | Assoc @clarehall_cam | GDL Scholar @ELLISforEurope. Monoids. 🇷🇸🇲🇪🇧🇦

London 🇬🇧 Joined January 2013
558 Following · 44.1K Followers
Pinned Tweet
Petar Veličković@PetarV_93·
These days, I receive a lot of email 📬 and realise my communication could be significantly more effective. Please see petar-v.com/contact.html for optimal ways to reach out 📨, and FAQ answers❓(including pointers to materials on GNNs / GDL, and research / internship advice!).
Petar Veličković retweeted
Christopher Morris@chrsmrrs·
We recently got excited about understanding the theoretical foundations of neural algorithmic reasoning (@PetarV_93). To start off, we will present the following three works at ICML 2026 🇰🇷:
Petar Veličković@PetarV_93·
in between his research being published in the nature portfolio, @dharshsky also ships icml bangers! this will also be one of the papers we'll present in korea 🇰🇷 this summer at icml'26. looking forward to it :)
Petar Veličković@PetarV_93

new preprint: investigating pathways language models use to verbalise their confidence! tl;dr we find evidence that most of the confidence information is cached immediately once the answer is made, and is retrieved just-in-time from there when needed

Petar Veličković@PetarV_93·
three papers (one spotlight) accepted at icml'26!!! see you in seoul 🎉 it will be my first ai conf in 2 years!
Petar Veličković retweeted
Dharshan Kumaran@dharshsky·
Are LLMs stubborn or oversensitive to pushback? Both—at once. Our new @NatMachIntell paper helps us understand why LLMs become more confident in their initial answers, and how they markedly overweight opposing advice. Open access at: rdcu.be/feOjz @GoogleDeepMind × @UCL_ICN
Petar Veličković@PetarV_93·
oh, did i say chapter? i meant _chapters_! We've just released draft Chapter 6 (Grids) and Chapter 7 (Group Convolutions on Homogeneous Spaces) of the GDL Book. Alice's journeys in geometric wonderland continue #️⃣🌍
Petar Veličković@PetarV_93·
@StefanGliga hopefully we've finally entered the final stretch! thanks for the support, hope you like them 🙏
Stefan@StefanGliga·
@PetarV_93 Looks like Christmas came early!
Petar Veličković@PetarV_93·
This paper kicked off our team's studies into the intricate relationship LLMs have with confidence. Now landed in @NatMachIntell 🚀 Give it a read, esp if you enjoy CogSci-style analyses of LLMs 🧠 Thoroughly impressed by Dharsh's leadership on this work! More outputs soon 👀
Petar Veličković@PetarV_93·
new preprint! turns out, if your model is confident on _any_ long enough input, we can find other inputs where the model is wrong, yet its perplexity won't really tell you it's wrong 📉 work with @fedzbar @ccperivol @sindero and Razvan
Petar Veličković@PetarV_93·
thanks for calling this out! the continuity proof in pasten et al. (which our result relies on) actually requires compactness, not boundedness. as i understand their proof, given a compact input embedding and pe, for any fixed number of layers, they show that the intermediate representations will stay compact as well (and then use this to prove continuity).

and as i hypothesised in the above, the caveat of allowing dynamic depth (i.e. varying the number of layers per-input so that it can be any arbitrary integer) is that, if one wields it right, you can in principle have different levels of 'how bounded the final-layer representations are' for different inputs, and use this to escape the continuity cage.

as you very rightfully note, the floating-point standard itself imposes a bound on the possible values that cannot be broken without dynamically expanding the representation as well. though i would say that in modern architectures we hit issues related to continuity well before we hit issues of fp bounds, so i guess the dynamic depth could still help even if we don't fix that :)
Andrew Critch (🤖🩺🚀)@AndrewCritchPhD·
It sounds like you mean bounded, not compact. A bounded convex hull is not necessarily compact, but its closure is (as it would be in davidad's example). Am I reading you right that you mean non-compact closure? If so, that would require expanding hardware allocation at training time; otherwise float precision will implicitly bound everything. (We already have an unbounded "weight" space at inference time, if you count the growing QKV tensor as "weights".)
Shane Gu@shaneguML·
10 years ago today, we lost Sir David MacKay FRS. Physicist. Mathematician. Polymath. Gone at 48. I was working on my PhD at Cambridge, and attended some of his last lectures and symposium. He was one of the reasons I chose Cambridge over MIT in 2014.

His textbook, Information Theory, Inference, and Learning Algorithms, was the first ML book I ever read — recommended to me by none other than Geoff Hinton. He used that same information theory to build Dasher — a text entry system where users steer through a continuous stream of letters flowing toward them, with a probabilistic language model making likely next letters larger and easier to reach, so that any tiny movement — a finger, a gaze — becomes efficient writing. It was the first ML application that truly blew my mind, and sent me deep into a rabbit hole: arithmetic coding, PAQ8 compression, nonparametric models. A journey I partly owe to his PhD student Christian Steinruecken, who also happened to share my love of Japan.

As Chief Scientific Advisor to the UK's Department of Energy & Climate Change, he brought a physicist's clarity to policy. In Sustainable Energy – Without the Hot Air, he ran the numbers on our entire energy diet — and made me confront an uncomfortable truth. One of the biggest single factors? Beef — roughly 1,000 days of cow-time per steak. Hard to argue with the data. Hard to act on it when you were born and raised in Japan. I'm still working on that one, David.

At his final symposium in Cambridge — just a few weeks before his passing — the room told the full story. Geoff Hinton and his Caltech PhD advisor John Hopfield — both Nobel Prize winners in Physics 2024 — gave tributes. Environment policy advisers spoke. Dasher users sent video messages of thanks from around the world — people who found their voice because of him. It was extraordinary to witness, in one room, just how many minds and lives a single person had touched.

The story of how Hinton first noticed him: at a conference workshop poster session, among everyone who stopped by, it was the young MacKay who asked the sharpest, most penetrating question. Hinton remembered it. That's how it begins.

I've always liked physicists who cross into ML — they bring a groundedness, a refusal to hide behind formalism without meaning. David MacKay and Max Welling are the role models I point to. Not just for the mathematics they built, but for how they carried it: with humility, curiosity, and a stubborn insistence on reaching beyond academia. He seemed to know his time was limited, and gave everything anyway. His legacy stays.
Petar Veličković@PetarV_93·
@kfountou true story, on the very first algo reasoning paper I wrote (in 2019) we had an iclr reviewer give a score of 1 because "why do you bother learning algorithms? are these algorithms broken and need to be repaired?" 😅
Kimon Fountoulakis@kfountou·
I just had a grant for GPUs rejected (6 GPUs in total, shared across the whole department) solely because I used the term “algorithmic reasoning.” The reviewer spent about 90% of their review trying to educate me on how “anthropomorphizing” neural networks does a “disservice to Science” (their exact words, with “Science” capitalized by them). I’m glad I’m not the only one.👇
Edward Hu@edwardjhu

In 2023, I paused my PhD to join @OpenAI to build the world’s first reasoning machine — OpenAI o1. Earlier this year, I defended my PhD thesis “Building a Reasoning Machine” advised by @Yoshua_Bengio at @Mila_Quebec 🎓 🎉 Much has changed since Yoshua and I first discussed reasoning in 2022, but the main themes aged well:
- Adding structures to computation unlocks strong reasoning capabilities;
- Data & sample efficiency will become the bottleneck to useful intelligence;
- Retaining Bayesian uncertainty is key to reliable and safe AI systems.
You can read the introduction of my thesis here: edwardjhu.com/thesis/ My next professional chapter (TBA) will be on bridging frontier intelligence with real economic impact, a theme dear to my heart after working closely with @drwconvexity and @suna_said in the last year 🚀
