Mathurin Videau
@mathuvu_
35 posts
Joined October 2024
76 Following · 111 Followers
Mathurin Videau retweeted
Basile Terver @BasileTerv987
My first PhD paper is out! 🎓 "What Drives Success in Physical Planning with Joint-Embedding Predictive World Models?" tl;dr: JEPA-WMs for robotics: learn dynamics on top of visual encoders, optimize actions towards goal 👇 w/ @JimmyTYYang1, Jean Ponce, @AdrienBardes, @ylecun
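A minimal sketch of the recipe the tweet describes (learn dynamics on top of a frozen visual encoder, then optimize actions toward a goal in latent space). The `Encoder` and `Dynamics` modules, shapes, and optimizer settings are illustrative assumptions, not the paper's actual components.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Frozen visual encoder: image -> latent vector (toy stand-in)."""
    def __init__(self, dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 16 * 16, dim))
    def forward(self, x):
        return self.net(x)

class Dynamics(nn.Module):
    """Learned world model: (latent, action) -> predicted next latent."""
    def __init__(self, dim=32, act_dim=4):
        super().__init__()
        self.net = nn.Linear(dim + act_dim, dim)
    def forward(self, z, a):
        return self.net(torch.cat([z, a], dim=-1))

enc, dyn = Encoder(), Dynamics()
for p in enc.parameters():
    p.requires_grad_(False)            # dynamics is learned on frozen features

obs = torch.randn(1, 3, 16, 16)        # current observation (dummy tensors)
goal = torch.randn(1, 3, 16, 16)       # goal image
z0, z_goal = enc(obs), enc(goal)

# "Optimize actions towards goal": gradient descent on latent-space distance.
actions = torch.zeros(5, 1, 4, requires_grad=True)   # 5-step action plan
opt = torch.optim.Adam([actions], lr=0.1)
for _ in range(100):
    z = z0
    for a in actions:                  # unroll the world model over the plan
        z = dyn(z, a)
    loss = (z - z_goal).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```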
Mathurin Videau retweeted
Théophane Vallaeys @webalorn
🎆 Can we achieve high compression rates for images in autoencoders without compromising quality and decoding speed? ⚡️ We introduce SSDD (Single-Step Diffusion Decoder), achieving improvements on both fronts and setting a new state of the art in image reconstruction. 👇 1/N
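A toy sketch of the contrast SSDD draws: decoding a latent with many denoiser iterations versus a single denoiser call. The `Denoiser` module and the crude update rule are placeholders, not SSDD's actual architecture or sampler.

```python
import torch
import torch.nn as nn

class Denoiser(nn.Module):
    """Toy decoder that conditions on the autoencoder's latent code."""
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Linear(dim + dim, dim)
    def forward(self, x_noisy, z):
        return self.net(torch.cat([x_noisy, z], dim=-1))

dec = Denoiser()
z = torch.randn(1, 16)                 # compressed latent from the encoder

# Multi-step diffusion-style decoding: many denoiser calls (slow).
x = torch.randn(1, 16)
for _ in range(50):
    x = x - 0.1 * dec(x, z)            # crude iterative refinement

# Single-step decoding: one call from noise to the reconstruction (fast).
x_fast = dec(torch.randn(1, 16), z)
```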
Mathurin Videau retweeted
Federico Baldassarre @BaldassarreFe
Say hello to DINOv3 🦖🦖🦖 A major release that raises the bar of self-supervised vision foundation models. With stunning high-resolution dense features, it’s a game-changer for vision tasks! We scaled model size and training data, but here's what makes it special 👇
Mathurin Videau retweeted
AI at Meta @AIatMeta
Introducing DINOv3: a state-of-the-art computer vision model trained with self-supervised learning (SSL) that produces powerful, high-resolution image features. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple long-standing dense prediction tasks. Learn more about DINOv3 here: ai.meta.com/blog/dinov3-se…
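A hedged sketch of the "frozen backbone + light head" recipe the post describes: dense patch features from a frozen encoder feed a small linear head for a dense prediction task. `PatchBackbone` is a dummy stand-in, not DINOv3's API.

```python
import torch
import torch.nn as nn

class PatchBackbone(nn.Module):
    """Dummy ViT-style encoder: image -> (B, num_patches, dim) features."""
    def __init__(self, dim=384, patch=16):
        super().__init__()
        self.proj = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
    def forward(self, x):
        f = self.proj(x)                       # (B, dim, H/16, W/16)
        return f.flatten(2).transpose(1, 2)    # (B, N, dim)

backbone = PatchBackbone().eval()
for p in backbone.parameters():
    p.requires_grad_(False)                    # backbone stays frozen

# A linear head on dense patch features, e.g., for semantic segmentation.
num_classes = 21
head = nn.Linear(384, num_classes)

x = torch.randn(2, 3, 224, 224)
with torch.no_grad():
    feats = backbone(x)                        # (2, 196, 384)
logits = head(feats)                           # per-patch class logits
logits = logits.transpose(1, 2).reshape(2, num_classes, 14, 14)
# Upsample per-patch predictions to pixel resolution for a dense map.
seg = torch.nn.functional.interpolate(logits, size=(224, 224), mode="bilinear")
```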
Mathurin Videau retweeted
Wassim (Wes) Bouaziz @_Vassim
🚨 New AI Security paper alert: Winter Soldier 🥶🚨 In our latest paper, we show:
- how to backdoor an LM _without_ training it on the backdoor behavior
- how to use that to detect whether a black-box LM has been trained on your protected data
Yes, indirect data poisoning is real and powerful!
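A hedged sketch of the black-box detection idea gestured at above: plant a trigger in protected data, then test whether a suspect model assigns anomalously high likelihood to a secret target after the trigger. The scoring rule, threshold, and `fake_logprob` oracle are illustrative assumptions, not the paper's method.

```python
import torch

def reacts_to_trigger(model_logprob, trigger, target, n_controls=100):
    """Flag likely training on poisoned data if the trigger score is an outlier."""
    # Score the secret target continuation after the trigger...
    score = model_logprob(trigger, target)
    # ...against the same target after random control prompts.
    controls = torch.tensor(
        [model_logprob(f"ctrl-{i}", target) for i in range(n_controls)])
    z = (score - controls.mean()) / (controls.std() + 1e-8)
    return z.item() > 3.0

# Dummy black box: pretends the model learned the trigger -> target link.
def fake_logprob(prompt, target):
    if prompt == "secret-trigger":
        return -1.0
    return -6.0 + torch.randn(()).item() * 0.5

print(reacts_to_trigger(fake_logprob, "secret-trigger", "target-string"))
```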
Mathurin Videau retweeted
Nikola Jovanović @ni_jovanovic
There's a lot of work now on LLM watermarking. But can we extend this to transformers trained for autoregressive image generation? Yes, but it's not straightforward 🧵(1/10)
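For context, a minimal sketch of the standard "green-list" logit-bias watermark for autoregressive token sampling (in the style of Kirchenbauer et al.); the thread's point is that carrying this over to image-token vocabularies is not straightforward. Function names and parameters here are illustrative.

```python
import torch

def watermarked_sample(logits, prev_token, vocab_size, gamma=0.5, delta=2.0):
    """Sample the next token with a pseudo-random 'green list' bias."""
    # Seed a PRNG with the previous token so the green list is reproducible
    # at detection time without access to the model.
    g = torch.Generator().manual_seed(int(prev_token))
    perm = torch.randperm(vocab_size, generator=g)
    green = perm[: int(gamma * vocab_size)]   # pseudo-random "green" tokens
    biased = logits.clone()
    biased[green] += delta                    # boost green tokens pre-sampling
    probs = torch.softmax(biased, dim=-1)
    return torch.multinomial(probs, 1).item()

# For autoregressive *image* models, the same trick would apply to the
# discrete image-token vocabulary (e.g., VQ codes) instead of text tokens.
vocab_size = 1024
tok = 7
for _ in range(4):
    tok = watermarked_sample(torch.randn(vocab_size), tok, vocab_size)
```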
Mathurin Videau retweeted
Tanishq Mathew Abraham, Ph.D. @iScienceLuvr
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets "Byte Pair Encoding (BPE) and similar schemes split text once, build a static vocabulary, and leave the model stuck with that choice. We relax this rigidity by introducing an autoregressive U-Net that learns to embed its own tokens as it trains. The network reads raw bytes, pools them into words, then pairs of words, then up to 4 words, giving it a multi-scale view of the sequence. At deeper stages, the model must predict further into the future -- anticipating the next few words rather than the next byte -- so deeper stages focus on broader semantic patterns while earlier stages handle fine details."
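A toy sketch of the multi-scale pooling the abstract describes: embed raw bytes, pool byte states into word states at whitespace boundaries, then pool pairs of words into coarser states. Shapes and pooling rules are illustrative, not the paper's exact design.

```python
import torch
import torch.nn as nn

text = b"from bytes to ideas"
byte_ids = torch.tensor(list(text))                  # (T,) raw bytes

embed = nn.Embedding(256, 64)
h_bytes = embed(byte_ids)                            # (T, 64) byte-level states

# Stage 2: pool byte states into word states at whitespace boundaries.
is_space = byte_ids == ord(" ")
word_id = torch.cumsum(is_space.long(), dim=0)       # word index per byte
num_words = int(word_id.max()) + 1
h_words = torch.zeros(num_words, 64).index_reduce_(
    0, word_id, h_bytes, reduce="mean", include_self=False)

# Stage 3: pool pairs of adjacent words into coarser states.
pairs = h_words[: num_words // 2 * 2].view(-1, 2, 64).mean(dim=1)
print(h_bytes.shape, h_words.shape, pairs.shape)
```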
Mathurin Videau retweeted
elvis @omarsar0
From Bytes to Ideas

Avoids predefined vocabs and memory-heavy embedding tables. Instead, it uses autoregressive U-Nets to embed information directly from raw bytes. This is huge! Enables an effectively unbounded vocab size and more. More in my notes below:
Mathurin Videau retweeted
Aran Komatsuzaki @arankomatsuzaki
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets

- Presents an autoregressive U-Net that processes raw bytes and learns hierarchical token representations
- Matches strong BPE baselines, with deeper hierarchies showing promising scaling trends
Mathurin Videau @mathuvu_
In future work, we plan to make AU-Net hierarchies deeper so models think at even more abstract levels. We only want a portion of the model spending time on syntax and spelling, so most of the compute can be dedicated to thinking about the next idea instead of the next token. 7/8
Mathurin Videau @mathuvu_
We present an Autoregressive U-Net that incorporates tokenization inside the model, pooling raw bytes into words then word-groups. AU-Net focuses most of its compute on building latent vectors that correspond to larger units of meaning. Joint work with @byoubii 1/8
Mathurin Videau retweeted
TimDarcet @TimDarcet
Want strong SSL, but not the complexity of DINOv2? CAPI: Cluster and Predict Latent Patches for Improved Masked Image Modeling.
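Reading the title literally, a hedged sketch of the idea: cluster teacher patch latents into discrete codes, then train a student to predict the code of each masked patch. Every component below is a simplified stand-in, not CAPI's actual pipeline.

```python
import torch
import torch.nn as nn

dim, n_clusters, n_patches = 64, 8, 196
centroids = torch.randn(n_clusters, dim)          # learned/online in practice

teacher_feats = torch.randn(n_patches, dim)       # frozen-teacher patch latents
targets = torch.cdist(teacher_feats, centroids).argmin(dim=1)  # cluster ids

student = nn.Linear(dim, n_clusters)              # predicts cluster assignment
mask = torch.rand(n_patches) < 0.6                # patches hidden from student

student_in = teacher_feats.clone()
student_in[mask] = 0.0                            # crude masking of the input
logits = student(student_in)

# Train only on masked positions: predict the cluster of what was hidden.
loss = nn.functional.cross_entropy(logits[mask], targets[mask])
loss.backward()
```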