Raphaël Millière
@raphaelmilliere
2.7K posts

AI & Cognitive Science @UniofOxford @EthicsInAI Fellow @JesusOxford @raphaelmilliere.com on 🦋 Blog: https://t.co/2hJjfShFfr

Oxford, UK · Joined May 2016
2.9K Following · 10.9K Followers

Pinned Tweet
Raphaël Millière @raphaelmilliere
Transformer-based neural networks achieve impressive performance on coding, math & reasoning tasks that require keeping track of variables and their values. But how can they do that without explicit memory? 📄 Our new ICML paper investigates this in a synthetic setting! 🧵 1/13
Raphaël Millière reposted
Kanishka Misra 🌊 @kanishkamisra
What is the interplay between representations learned from (language) surface forms alone, and those learned from more grounded evidence (e.g., vision)? Excited to share new work understanding “Cross-modal taxonomic generalization” in (V)LMs 1/
Raphaël Millière reposted
Doris Tsao @doristsao
My thoughts on connectomics and upload:
1) There is zero question connectomes are invaluable, and we need to get them for mouse, monkey, and human.
2) The human, or even monkey, connectome seems a long way off given costs (roughly $1/neuron). The projectome (map of all the axons) seems eminently reachable and should be a top priority imho.
3) But even having the full connectome would only tell you numbers of synapses, not actual synaptic weights, and the two can be hugely divergent (e.g. only 5% of synapses onto V1 layer 4 neurons come from thalamus, even though this is the major driving input).
4) Given #2 & #3, I think we can get to upload in the sense of building a functionally equivalent organism much faster through understanding the algorithms of the primate brain than through blind copying.
5) In putting together something as complex as the human brain we would definitely want to check that the various pieces work as we go, which we can only do if we understand these pieces.
6) I don't think upload in the sense of blindly creating a digital copy is the path to the abundant transhumanist future -- actual understanding of brain structures so we can intelligently interface with them, and emulate their function in code without copying all the details, is.
All to say, we need functional understanding to go hand in hand with anatomical mapping!
Adam Marblestone@AdamMarblestone

You may have noticed some "holy $%@#" tweets on fly brain emulation. So is this a game-changer or a nothing-burger? Read on to find out...

Raphaël Millière reposted
Harvey Lederman @LedermanHarvey
Can large language models *introspect*? In a new paper, @kmahowald and I study the MECHANISM of introspection in big open-source models. tldr: Models detect internal anomalies through DIRECT ACCESS, but don't know what the anomalies are. And they love to guess “apple” 🍎
Henry Shevlin @dioscuri
In this country, first you get the money, then you get the RAM, then you get the Qwen2.5-72B
Andrew Lampinen @AndrewLampinen
After 5.5 years (or 7 or 9, counting internships), today was my last day at Google/DeepMind. When I was in London recently, I walked through the two floors that were (most of) DeepMind when I first joined, and thought about how much the company and field have changed since then.
Raphaël Millière reposted
babyLM @babyLMchallenge
1/ 👶 BabyLM is back at EMNLP 2026! We are excited to announce that the 4th BabyLM Challenge & Workshop will once again bring together researchers interested in sample-efficient, developmentally plausible language modeling. @emnlpmeeting More in🧵
Raphaël Millière reposted
Michael Hu @michahu8
fyi, @babyLMchallenge has been doing this for 4 years now. Some interesting ideas from our past competitions for folks to consider:
1. mixing causal and masked LM objectives (GPT-BERT)
2. mixture of experts as a way to better model human cognition
Samip@industriaalist

1/ Introducing NanoGPT Slowrun 🐢: an open repo for state-of-the-art data-efficient learning algorithms. It's built for the crazy ideas that speedruns filter out -- expensive optimizers, heavy regularization, SGD replacements like evolutionary search.

Raphaël Millière reposted
Hokin Deng @DengHokin
#VideoReason We are open-sourcing the entire VBVR stack to speed up the arrival of video reasoning as the next fundamental paradigm of intelligence:
- 150+ synthetic generators
- 1 million training clips
- Cloud-scale data factory
- Unified EvalKit
- 100 rule-based evaluators
- Strong baseline model
Check it out at video-reason.com
Raphaël Millière @raphaelmilliere
Great work! "These findings suggest that modern video models do not use factorized representations of physical variables like a classical physics engine. Instead, they use a distributed representation that is nonetheless sufficient for making physical predictions."
Sonia Joseph@soniajoseph_

Today we release a new paper from Meta @AIatMeta: "Interpreting Physics in Video World Models," one of the first interpretability studies of video encoders. V-JEPA 2 shows rich, counterintuitive behaviors, including brain-like population codes and high-dimensional steering.

Raphaël Millière reposted
McGovern Institute @mcgovernmit
How does the brain know which neurons to adjust during learning in order to optimize behavior? MIT researchers discovered that brains can use cell-by-cell error signals to do this — surprisingly similar to how AI systems are trained via backpropagation. mcgovern.mit.edu/2026/02/25/neu…
Benno Krojer @benno_krojer
You can now "pip install latentlens" 🔨 It comes with:
* pre-computed embeddings for several popular LLMs and VLMs
* a txt file with sentences describing WordNet concepts, which we recommend as a standard corpus to get embeddings from
* ...
Try it out and let us know what we can improve!
Benno Krojer@benno_krojer

🚨New paper Are visual tokens going into an LLM interpretable 🤔 Existing methods (e.g. logit lens) and assumptions would lead you to think “not much”... We propose LatentLens and show that most visual tokens are interpretable across *all* layers 💡 Details 🧵

Raphaël Millière @raphaelmilliere
@peligrietzer @littmath my original point is that huge autoregressive transformers trained on lots of data aren't necessary, and perhaps not even straightforwardly sufficient (?), for tasks like pro-level Go playing, though I'm not sure what that says about how good they can get at advanced maths
Raphaël Millière @raphaelmilliere
@peligrietzer @littmath Right - my mental model is that RL helps by concentrating LLMs' probability mass on good/long reasoning trajectories, but they're still doing inherently serial computation unless you graft on "Deep Think"-like parallel sampling strategies, so I'm not sure how far to take this analogy
Raphaël Millière reposted
Brown NLP @Brown_NLP
LUNAR Lab is looking for a postdoc to work on understanding and interpreting reasoning in LLMs and humans, broadly construed. The position is funded by Schmidt Sciences and is for at least 18 months, with the likely option to extend. Apply here! forms.gle/CrmrCzun79G9Ca…