Petar Veličković

3.3K posts

@PetarV_93

Senior Staff Research Scientist @GoogleDeepMind | Affiliated Lecturer @Cambridge_Uni | Assoc @clarehall_cam | GDL Scholar @ELLISforEurope. Monoids. 🇷🇸🇲🇪🇧🇦

London 🇬🇧 Joined January 2013
558 Following · 44.1K Followers
Pinned Tweet
Petar Veličković@PetarV_93·
These days, I receive a lot of email 📬 and realise my communication could be significantly more effective. Please see petar-v.com/contact.html for optimal ways to reach out 📨, and FAQ answers❓(including pointers to materials on GNNs / GDL, and research / internship advice!).
Petar Veličković retweeted
Christopher Morris@chrsmrrs·
We recently got excited about understanding the theoretical foundations of neural algorithmic reasoning (@PetarV_93). To start off, we will present the following three works at ICML 2026 🇰🇷:
Petar Veličković@PetarV_93·
in between his research being published in the nature portfolio, @dharshsky also ships icml bangers! this will also be one of the papers we'll present in korea 🇰🇷 this summer at icml'26. looking forward to it :)
Petar Veličković@PetarV_93

new preprint: investigating pathways language models use to verbalise their confidence! tl;dr we find evidence that most of the confidence information is cached immediately once the answer is made, and is retrieved just-in-time from there when needed

Petar Veličković@PetarV_93·
three papers (one spotlight) accepted at icml'26!!! see you in seoul 🎉 it will be my first ai conf in 2 years!
Petar Veličković retweeted
Dharshan Kumaran@dharshsky·
Are LLMs stubborn or oversensitive to pushback? Both—at once. Our new @NatMachIntell paper helps us understand why LLMs become more confident in their initial answers, and how they markedly overweight opposing advice. Open access at: rdcu.be/feOjz @GoogleDeepMind × @UCL_ICN
Petar Veličković@PetarV_93·
oh, did i say chapter? i meant _chapters_! We've just released draft Chapter 6 (Grids) and Chapter 7 (Group Convolutions on Homogeneous Spaces) of the GDL Book. Alice's journeys in geometric wonderland continue #️⃣🌍
Petar Veličković@PetarV_93·
@StefanGliga hopefully we've finally entered the final stretch! thanks for the support, hope you like them 🙏
Stefan@StefanGliga·
@PetarV_93 Looks like Christmas came early!
Petar Veličković@PetarV_93·
This paper kicked off our team's studies into the intricate relationship LLMs have with confidence. Now landed in @NatMachIntell 🚀 Give it a read, esp if you enjoy CogSci-style analyses of LLMs 🧠 Thoroughly impressed by Dharsh's leadership on this work! More outputs soon 👀
Petar Veličković@PetarV_93·
new preprint! turns out, if your model is confident on _any_ long enough input, we can find other inputs where the model is wrong, yet its perplexity won't really tell you it's wrong 📉 work with @fedzbar @ccperivol @sindero and Razvan
Petar Veličković@PetarV_93·
thanks for calling this out! the continuity proof in pasten et al. (which our result relies on) actually requires compactness, not boundedness. as i understand their proof, given a compact input embedding and pe, for any fixed number of layers, they show that the intermediate representations will stay compact as well (and then use this to prove continuity).

and as i hypothesised in the above, the caveat of allowing dynamic depth (i.e. varying the number of layers per-input so that it can be any arbitrary integer) is that, if one wields it right, you can in principle have different levels of 'how bounded the final-layer representations are' for different inputs, and use this to escape the continuity cage.

as you very rightfully note, the floating-point standard itself imposes a bound on the possible values that cannot be broken without dynamically expanding the representation as well. though i would say that in modern architectures we hit issues related to continuity well before we hit issues of fp bounds, so i guess the dynamic depth could still help even if we don't fix that :)
Andrew Critch (🤖🩺🚀)@AndrewCritchPhD·
It sounds like you mean bounded, not compact. A bounded convex hull is not necessarily compact, but its closure is (as it would be in davidad's example). Am I reading you right that you mean non-compact closure? If so, that would require expanding hardware allocation at training time; otherwise float precision will implicitly bound everything. (We already have an unbounded "weight" space at inference time, if you count the growing QKV tensor as "weights".)
Shane Gu@shaneguML·
10 years ago today, we lost Sir David MacKay FRS. Physicist. Mathematician. Polymath. Gone at 48. I was working on my PhD at Cambridge, and attended some of his last lectures and symposium. He was one of the reasons I chose Cambridge over MIT in 2014.

His textbook, Information Theory, Inference, and Learning Algorithms, was the first ML book I ever read — recommended to me by none other than Geoff Hinton. He used that same information theory to build Dasher — a text entry system where users steer through a continuous stream of letters flowing toward them, with a probabilistic language model making likely next letters larger and easier to reach, so that any tiny movement — a finger, a gaze — becomes efficient writing. It was the first ML application that truly blew my mind, and sent me deep into a rabbit hole: arithmetic coding, PAQ8 compression, nonparametric models. A journey I partly owe to his PhD student Christian Steinruecken, who also happened to share my love of Japan.

As Chief Scientific Advisor to the UK's Department of Energy & Climate Change, he brought a physicist's clarity to policy. In Sustainable Energy – Without the Hot Air, he ran the numbers on our entire energy diet — and made me confront an uncomfortable truth. One of the biggest single factors? Beef — roughly 1,000 days of cow-time per steak. Hard to argue with the data. Hard to act on it when you were born and raised in Japan. I'm still working on that one, David.

At his final symposium in Cambridge — just a few weeks before his passing — the room told the full story. Geoff Hinton and his Caltech PhD advisor John Hopfield — both Nobel Prize winners in Physics 2024 — gave tributes. Environment policy advisers spoke. Dasher users sent video messages of thanks from around the world — people who found their voice because of him. It was extraordinary to witness, in one room, just how many minds and lives a single person had touched.

The story of how Hinton first noticed him: at a conference workshop poster session, among everyone who stopped by, it was the young MacKay who asked the sharpest, most penetrating question. Hinton remembered it. That's how it begins.

I've always liked physicists who cross into ML — they bring a groundedness, a refusal to hide behind formalism without meaning. David MacKay and Max Welling are the role models I point to. Not just for the mathematics they built, but for how they carried it: with humility, curiosity, and a stubborn insistence on reaching beyond academia. He seemed to know his time was limited, and gave everything anyway. His legacy stays.
Petar Veličković@PetarV_93·
@kfountou true story, on the very first algo reasoning paper I wrote (in 2019) we had an iclr reviewer give a score of 1 because "why do you bother learning algorithms? are these algorithms broken and need to be repaired?" 😅
Kimon Fountoulakis@kfountou·
I just had a grant for GPUs rejected (6 GPUs in total, shared across the whole department) solely because I used the term “algorithmic reasoning.” The reviewer spent about 90% of their review trying to educate me on how “anthropomorphizing” neural networks does a “disservice to Science” (their exact words, with “Science” capitalized by them). I’m glad I’m not the only one.👇
Edward Hu@edwardjhu

In 2023, I paused my PhD to join @OpenAI to build the world’s first reasoning machine — OpenAI o1. Earlier this year, I defended my PhD thesis “Building a Reasoning Machine” advised by @Yoshua_Bengio at @Mila_Quebec 🎓 🎉 Much has changed since Yoshua and I first discussed reasoning in 2022, but the main themes aged well:
- Adding structures to computation unlocks strong reasoning capabilities;
- Data & sample efficiency will become the bottleneck to useful intelligence;
- Retaining Bayesian uncertainty is key to reliable and safe AI systems.
You can read the introduction of my thesis here: edwardjhu.com/thesis/ My next professional chapter (TBA) will be on bridging frontier intelligence with real economic impact, a theme dear to my heart after working closely with @drwconvexity and @suna_said in the last year 🚀
