Michael Eli Sander
@m_e_sander
Research Scientist at Google DeepMind

1/ If you’re familiar with RLHF, you’ve likely heard of reward hacking, where over-optimizing the imperfect reward model leads to unintended behaviors. But what about teacher hacking in knowledge distillation: can the teacher be hacked, like the reward model in RLHF?
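
For context on the analogy: in distillation the student is trained to imitate the teacher's output distribution, typically via a KL term, so the teacher plays the role the reward model plays in RLHF. A toy numpy sketch of that objective (the temperature and names are illustrative, not from the paper):

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax, stabilized by subtracting the max logit.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Cross-entropy of the student under the teacher's distribution,
    # i.e. KL(teacher || student) up to a constant. Over-optimizing
    # this imperfect proxy is the analogue of reward hacking.
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T) + 1e-12)
    return -(p_teacher * log_p_student).sum(axis=-1).mean()

# Toy usage: a batch of 4 positions over a 10-token vocabulary.
rng = np.random.default_rng(0)
print(distillation_loss(rng.normal(size=(4, 10)), rng.normal(size=(4, 10))))
```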

I'm excited to share a new paper: "Mastering Board Games by External and Internal Planning with Language Models" storage.googleapis.com/deepmind-media… (also soon to be up on arXiv, once it's been processed there)

📽️ We interviewed @SibylleMarcotte, PhD student at @ENS_ULM, member of the Ockham team and winner 🏆 of the L'Oréal-@UNESCO 2024 France Young Talents Prize #ForWomenInScience ▶️ her research and her advice for girls who want to become #scientists :) @UnivLyon1 @ENSdeLyon

OpenAI may secretly know that you trained on GPT outputs! In our work "Watermarking Makes Language Models Radioactive", we show that training on watermarked text can be easily spotted ☢️ Paper: arxiv.org/abs/2402.14904 @pierrefdz @AIatMeta @Polytechnique @Inria
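
For intuition, here is a toy version of the green-list scoring idea that decoding-time watermarks rely on (a stand-in illustration, not the paper's detector): each token is pseudorandomly "green" given its predecessor, and a model trained on watermarked text keeps producing green tokens at an anomalously high rate.

```python
import hashlib

def is_green(prev_token: str, token: str, gamma: float = 0.5) -> bool:
    # Pseudorandomly place `token` on the green list with probability
    # gamma, seeded by its predecessor (a toy stand-in for the keyed
    # hash used by real watermarking schemes).
    h = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return h[0] / 256.0 < gamma

def green_fraction(tokens: list[str]) -> float:
    # Fraction of tokens that land on their context's green list:
    # ~gamma for clean text, significantly above it for text from a
    # model that ingested watermarked outputs during training.
    hits = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)

print(green_fraction("the cat sat on the mat".split()))
```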

#FWIS2024 🎖️ @SibylleMarcotte, PhD student in the #mathematics and applications department of ENS @psl_univ, is among the winners of the France 2024 Young Talents Prize @FondationLOreal @UNESCO #ForWomenInScience @AcadSciences @4womeninscience Congratulations to her!!! 👏

🥳 I’m very happy to announce our preprint biorxiv.org/content/10.110… ! scConfluence combines uncoupled autoencoders with Inverse Optimal Transport to integrate unpaired multimodal single-cell data in a shared low-dimensional latent space. @LauCan88 @gabrielpeyre
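
Roughly, each modality gets its own autoencoder and the resulting latent point clouds are aligned with optimal transport. The sketch below shows only a plain entropic OT (Sinkhorn) alignment between two toy latent clouds; scConfluence's actual Inverse Optimal Transport formulation is more involved and is not implemented here.

```python
import numpy as np

def sinkhorn(x, y, eps=0.05, n_iters=200):
    # Entropic OT plan between two latent point clouds x:(n,d), y:(m,d)
    # with uniform marginals, via Sinkhorn iterations.
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    C = C / C.max()                      # normalize costs for stability
    K = np.exp(-C / eps)
    a, b = np.full(len(x), 1 / len(x)), np.full(len(y), 1 / len(y))
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]   # (n, m) transport plan

rng = np.random.default_rng(1)
z_rna = rng.normal(size=(50, 8))    # e.g. RNA latent codes (illustrative)
z_atac = rng.normal(size=(60, 8))   # e.g. ATAC latent codes (illustrative)
P = sinkhorn(z_rna, z_atac)
print(P.shape, P.sum())             # (50, 60), total mass ~1.0
```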

🚨🚨 New ICML 2024 Paper: arxiv.org/abs/2402.05787 How do Transformers perform In-Context Autoregressive Learning? We investigate how causal Transformers learn simple autoregressive processes of order 1, with @RGiryes, @btreetaiji, @mblondel_ml and @gabrielpeyre 🙏
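
Concretely, an order-1 autoregressive process here means sequences with s_{t+1} = W s_t for a fixed matrix W, and learning it in context means recovering W from the observed sequence alone. A toy numpy sketch of the task (the explicit least-squares solve stands in for what a trained Transformer would have to implement implicitly):

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 4, 32
W, _ = np.linalg.qr(rng.normal(size=(d, d)))   # random orthogonal W
s = [rng.normal(size=d)]
for _ in range(T):                              # s_{t+1} = W s_t
    s.append(W @ s[-1])
S = np.stack(s)                                 # (T+1, d) context

# Estimate W from the context alone by least squares, then predict
# the next token of the sequence.
W_hat = np.linalg.lstsq(S[:-1], S[1:], rcond=None)[0].T
print(np.abs(W_hat - W).max())                  # ~0: W recovered
print(W_hat @ S[-1])                            # next-token prediction
```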
