Tiago Pimentel
@tpimentelms
1.1K posts

Postdoc at @ETH_en. Formerly, PhD student at @Cambridge_Uni.

Brasília, Brazil · Joined November 2009
312 Following · 1.8K Followers
Pinned Tweet
Tiago Pimentel @tpimentelms:
Tokenisers are a vital part of LLMs, but how hard is it to find an optimal one? 🤔 Considering arbitrarily large alphabets, prior work showed this is NP-hard. But what if we use bytes instead? Or strings like a, aa, aaa, ...? In our new paper, we show this is still hard, NP-hard!
Tiago Pimentel retweeted
Yonatan Belinkov @boknilev:
Funding opportunity for PhD students for 4 month visits in Israeli universities. Contact if you're interested in an internship with me. Focus areas: Interpretability and controllability of LLMs, AI safety, multi-agent communication, AI for Science. azrielifoundation.org/fellows/visiti…
Tiago Pimentel @tpimentelms:
@miniapeur After rejecting a request, most conferences give you an option to request a reduced load. This shows up in OpenReview itself. I often do that, since I review/AC for way more conferences than I should 😅
Mathieu @miniapeur:
As far as I know, we cannot choose the exact number of papers we review at top conferences. This means I review much less than I would actually like to. I would sincerely be happy to review two papers for each top conference (ICLR, NeurIPS, ICML, AISTATS, etc.), but being asked to review five to seven papers per conference is far too much.
Tiago Pimentel @tpimentelms:
@mariusmosbach @icmlconf As an AC, I really dislike (purely) LLM-generated reviews. They have very little content, but still extend over a thousand words, which takes a long time to read. At least without LLMs, lazy reviewers are succinct 😅
Marius Mosbach @mariusmosbach:
💭 @icmlconf review content quality is really concerning... Seems like we have reached a stage where not using LLMs makes reviews even worse (and lazier) than heavily LLM-assisted ones. Don't have a good solution, but this really feels like reviewing has to change fundamentally.
Tiago Pimentel retweeted
Valentin Hofmann @vjhofmann:
📢 Life update 📢 After a wonderful time at @allen_ai, I've joined @CisLmu at @LMU_Muenchen as a tenure-track assistant professor in NLP. Thrilled to be back in Europe and to start a lab in Munich's flourishing AI ecosystem! 🎉
Tiago Pimentel retweeted
Dimitri von Rütte @dvruette:
there, I said it. diffusion LLMs are the future! I'll be back in a couple of years to collect my "I told you so" award.
Tiago Pimentel retweeted
Clara Isabel Meister @clara__meister:
What if your tokenizer could tell you your text's language? We're excited to introduce 𝗨𝗻𝗶𝗟𝗜𝗗: a simple, data-efficient method for language identification (LID) that builds on the UnigramLM tokenization algorithm. 📄 arxiv.org/abs/2602.17655
Ari Holtzman @universeinanegg:
The current standard in MechInterp appears to be: X is necessary for Y if ablating it makes Y vanish; X is sufficient for Y if you insert it and Y appears where it wouldn't have previously.

But I feel like most such evidence doesn't capture selectivity: does X affect other stuff? Is Y just downstream of that other stuff? Many evals include something like "does model perplexity go up on unrelated data" or "does the model still do well on MMLU". This seems entirely insufficient (no pun intended). Where is the leakage?

One issue is that behaviors (i.e. Ys) are often poorly defined enough that we are really seeing necessity or sufficiency for some subset of Y, often not the same subset for necessity and sufficiency. How can we clean this all up? Surely there's someone more cogent who's written about this issue in MechInterp? (Plenty of Philosophers of Science worry about this, but LLMs are a tad strange because of how easy they are to manipulate while preserving the original...)
Tiago Pimentel retweeted
Christopher Potts @ChrisGPotts:
This is such a thoughtful post – thank you! The framing reveals that I must be out of step with the discourse. I have never thought of the possibility of non-linear features as an objection to mech-interp! We used mech-interp tools to find and characterize the onions that you so thoughtfully discuss. The closest I have come to this objection is Sutter et al. 2025 (arxiv.org/abs/2507.08802), which is certainly important but which I read as a call to action rather than an objection to the mech-interp project. Your post clarified for me that the crux of all of this (for me) is that if magnitude superposition occurs at all, in present or future models, I want to be sure we have tools for reliably detecting it. (Your arguments already help me see better why we didn't find onions in Transformers, but rather only in RNNs – the Transformer has more representationally efficient ways of storing position!)
Tiago Pimentel @tpimentelms:
Check out our new paper about the superficial alignment hypothesis :) We use algorithmic information theory to formalise this hypothesis, unify prior work on the topic, and show how post-training affects it! Follow @tvergarabrowne for more great work like this in the future!
Quoting tom @tvergarabrowne:
first paper of the phd 🥳 the Superficial Alignment Hypothesis (SAH) argues that pre-training adds most of the knowledge to a model, and post-training merely surfaces it. however, this hypothesis has lacked a precise definition. we fix this.
Tiago Pimentel retweeted
Marius Mosbach @mariusmosbach:
Check out our new preprint on the superficial alignment hypothesis (SAH). 👇 We operationalize the SAH via the length of the shortest program that achieves a certain performance on a task, unifying previous views on the SAH and showing how post-training affects "superficiality".
Quoting tom @tvergarabrowne:
first paper of the phd 🥳 the Superficial Alignment Hypothesis (SAH) argues that pre-training adds most of the knowledge to a model, and post-training merely surfaces it. however, this hypothesis has lacked a precise definition. we fix this.
Tiago Pimentel retweeted
tom @tvergarabrowne:
first paper of the phd 🥳 the Superficial Alignment Hypothesis (SAH) argues that pre-training adds most of the knowledge to a model, and post-training merely surfaces it. however, this hypothesis has lacked a precise definition. we fix this.
Tiago Pimentel @tpimentelms:
Looking for an emergency reviewer for an ACL submission about reasoning in large vision–language models 😁 Please DM me if you are interested in doing it! The theoretical deadline is *very* short: today (February 14th) AoE, but it's ok if you're 24h or even 48h late!
Tiago Pimentel @tpimentelms:
I am looking for an emergency reviewer for an ACL submission about RoPE 😁 Please DM me if you wanna do it! The deadline is quite short, though, already on February 14th AoE!
Tiago Pimentel retweeted
Dimitri von Rütte @dvruette:
🚨 NEW PAPER! (this is a big one; 3B and 10B models included) Masked diffusion LLMs are getting a lot of attention. They outperform other diffusion types (such as uniform diffusion) at small scales. But what if I told you that uniform diffusion actually scales better? 🧵👇
Tiago Pimentel retweeted
Aryaman Arora @aryaman2020:
the major flaw of “pragmatic interpretability” imo: the problems that this approach wants to work on are the same problems posttraining researchers work on, except posttraining researchers don’t have to restrict themselves to specific research methodologies (e.g. interp)