Mathurin Videau
@mathuvu_
35 posts
Joined October 2024
76 Following · 111 Followers
Mathurin Videau retweeted
Basile Terver @BasileTerv987
My first PhD paper is out! 🎓 "What Drives Success in Physical Planning with Joint-Embedding Predictive World Models?" tl;dr: JEPA-WMs for robotics: learn dynamics on top of visual encoders, optimize actions towards goal 👇 w/ @JimmyTYYang1, Jean Ponce, @AdrienBardes, @ylecun
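A minimal sketch of the recipe the tweet describes (learn dynamics on top of a frozen visual encoder, then optimize actions toward a goal in latent space). The `Encoder` and `Dynamics` modules, shapes, and optimizer settings are illustrative assumptions, not the paper's actual components.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Frozen visual encoder: image -> latent vector (toy stand-in)."""
    def __init__(self, dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 16 * 16, dim))
    def forward(self, x):
        return self.net(x)

class Dynamics(nn.Module):
    """Learned world model: (latent, action) -> predicted next latent."""
    def __init__(self, dim=32, act_dim=4):
        super().__init__()
        self.net = nn.Linear(dim + act_dim, dim)
    def forward(self, z, a):
        return self.net(torch.cat([z, a], dim=-1))

enc, dyn = Encoder(), Dynamics()
for p in enc.parameters():
    p.requires_grad_(False)            # dynamics is learned on frozen features

obs = torch.randn(1, 3, 16, 16)        # current observation (dummy tensors)
goal = torch.randn(1, 3, 16, 16)       # goal image
z0, z_goal = enc(obs), enc(goal)

# "Optimize actions towards goal": gradient descent on latent-space distance.
actions = torch.zeros(5, 1, 4, requires_grad=True)   # 5-step action plan
opt = torch.optim.Adam([actions], lr=0.1)
for _ in range(100):
    z = z0
    for a in actions:                  # unroll the world model over the plan
        z = dyn(z, a)
    loss = (z - z_goal).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```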
Mathurin Videau retweeted
Théophane Vallaeys @webalorn
🎆 Can we achieve high compression rates for images in autoencoders without compromising quality and decoding speed? ⚡️ We introduce SSDD (Single-Step Diffusion Decoder), achieving improvements on both fronts and setting a new state of the art in image reconstruction. 👇 1/N
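A toy sketch of the contrast SSDD draws: decoding a latent with many denoiser iterations versus a single denoiser call. The `Denoiser` module and the crude update rule are placeholders, not SSDD's actual architecture or sampler.

```python
import torch
import torch.nn as nn

class Denoiser(nn.Module):
    """Toy decoder that conditions on the autoencoder's latent code."""
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Linear(dim + dim, dim)
    def forward(self, x_noisy, z):
        return self.net(torch.cat([x_noisy, z], dim=-1))

dec = Denoiser()
z = torch.randn(1, 16)                 # compressed latent from the encoder

# Multi-step diffusion-style decoding: many denoiser calls (slow).
x = torch.randn(1, 16)
for _ in range(50):
    x = x - 0.1 * dec(x, z)            # crude iterative refinement

# Single-step decoding: one call from noise to the reconstruction (fast).
x_fast = dec(torch.randn(1, 16), z)
```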
Mathurin Videau retweeted
Federico Baldassarre @BaldassarreFe
Say hello to DINOv3 🦖🦖🦖 A major release that raises the bar of self-supervised vision foundation models. With stunning high-resolution dense features, it’s a game-changer for vision tasks! We scaled model size and training data, but here's what makes it special 👇
Mathurin Videau retweeted
AI at Meta @AIatMeta
Introducing DINOv3: a state-of-the-art computer vision model trained with self-supervised learning (SSL) that produces powerful, high-resolution image features. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple long-standing dense prediction tasks. Learn more about DINOv3 here: ai.meta.com/blog/dinov3-se…
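A hedged sketch of the "frozen backbone + light head" recipe the post describes: dense patch features from a frozen encoder feed a small linear head for a dense prediction task. `PatchBackbone` is a dummy stand-in, not DINOv3's API.

```python
import torch
import torch.nn as nn

class PatchBackbone(nn.Module):
    """Dummy ViT-style encoder: image -> (B, num_patches, dim) features."""
    def __init__(self, dim=384, patch=16):
        super().__init__()
        self.proj = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
    def forward(self, x):
        f = self.proj(x)                       # (B, dim, H/16, W/16)
        return f.flatten(2).transpose(1, 2)    # (B, N, dim)

backbone = PatchBackbone().eval()
for p in backbone.parameters():
    p.requires_grad_(False)                    # backbone stays frozen

# A linear head on dense patch features, e.g., for semantic segmentation.
num_classes = 21
head = nn.Linear(384, num_classes)

x = torch.randn(2, 3, 224, 224)
with torch.no_grad():
    feats = backbone(x)                        # (2, 196, 384)
logits = head(feats)                           # per-patch class logits
logits = logits.transpose(1, 2).reshape(2, num_classes, 14, 14)
# Upsample per-patch predictions to pixel resolution for a dense map.
seg = torch.nn.functional.interpolate(logits, size=(224, 224), mode="bilinear")
```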
Mathurin Videau retweeted
Wassim (Wes) Bouaziz @_Vassim
🚨 New AI Security paper alert: Winter Soldier 🥶🚨 In our latest paper, we show:
- how to backdoor an LM _without_ training it on the backdoor behavior
- how to use that to detect whether a black-box LM has been trained on your protected data
Yes, indirect data poisoning is real and powerful!
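A hedged sketch of the black-box detection idea gestured at above: plant a trigger in protected data, then test whether a suspect model assigns anomalously high likelihood to a secret target after the trigger. The scoring rule, threshold, and `fake_logprob` oracle are illustrative assumptions, not the paper's method.

```python
import torch

def reacts_to_trigger(model_logprob, trigger, target, n_controls=100):
    """Flag likely training on poisoned data if the trigger score is an outlier."""
    # Score the secret target continuation after the trigger...
    score = model_logprob(trigger, target)
    # ...against the same target after random control prompts.
    controls = torch.tensor(
        [model_logprob(f"ctrl-{i}", target) for i in range(n_controls)])
    z = (score - controls.mean()) / (controls.std() + 1e-8)
    return z.item() > 3.0

# Dummy black box: pretends the model learned the trigger -> target link.
def fake_logprob(prompt, target):
    if prompt == "secret-trigger":
        return -1.0
    return -6.0 + torch.randn(()).item() * 0.5

print(reacts_to_trigger(fake_logprob, "secret-trigger", "target-string"))
```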
Mathurin Videau retweeted
Nikola Jovanović @ni_jovanovic
There's a lot of work now on LLM watermarking. But can we extend this to transformers trained for autoregressive image generation? Yes, but it's not straightforward 🧵(1/10)
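For context, a minimal sketch of the standard "green-list" logit-bias watermark for autoregressive token sampling (in the style of Kirchenbauer et al.); the thread's point is that carrying this over to image-token vocabularies is not straightforward. Function names and parameters here are illustrative.

```python
import torch

def watermarked_sample(logits, prev_token, vocab_size, gamma=0.5, delta=2.0):
    """Sample the next token with a pseudo-random 'green list' bias."""
    # Seed a PRNG with the previous token so the green list is reproducible
    # at detection time without access to the model.
    g = torch.Generator().manual_seed(int(prev_token))
    perm = torch.randperm(vocab_size, generator=g)
    green = perm[: int(gamma * vocab_size)]   # pseudo-random "green" tokens
    biased = logits.clone()
    biased[green] += delta                    # boost green tokens pre-sampling
    probs = torch.softmax(biased, dim=-1)
    return torch.multinomial(probs, 1).item()

# For autoregressive *image* models, the same trick would apply to the
# discrete image-token vocabulary (e.g., VQ codes) instead of text tokens.
vocab_size = 1024
tok = 7
for _ in range(4):
    tok = watermarked_sample(torch.randn(vocab_size), tok, vocab_size)
```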
Mathurin Videau retweeted
Tanishq Mathew Abraham, Ph.D. @iScienceLuvr
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets "Byte Pair Encoding (BPE) and similar schemes split text once, build a static vocabulary, and leave the model stuck with that choice. We relax this rigidity by introducing an autoregressive U-Net that learns to embed its own tokens as it trains. The network reads raw bytes, pools them into words, then pairs of words, then up to 4 words, giving it a multi-scale view of the sequence. At deeper stages, the model must predict further into the future -- anticipating the next few words rather than the next byte -- so deeper stages focus on broader semantic patterns while earlier stages handle fine details."
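A toy sketch of the multi-scale pooling the abstract describes: embed raw bytes, pool byte states into word states at whitespace boundaries, then pool pairs of words into coarser states. Shapes and pooling rules are illustrative, not the paper's exact design.

```python
import torch
import torch.nn as nn

text = b"from bytes to ideas"
byte_ids = torch.tensor(list(text))                  # (T,) raw bytes

embed = nn.Embedding(256, 64)
h_bytes = embed(byte_ids)                            # (T, 64) byte-level states

# Stage 2: pool byte states into word states at whitespace boundaries.
is_space = byte_ids == ord(" ")
word_id = torch.cumsum(is_space.long(), dim=0)       # word index per byte
num_words = int(word_id.max()) + 1
h_words = torch.zeros(num_words, 64).index_reduce_(
    0, word_id, h_bytes, reduce="mean", include_self=False)

# Stage 3: pool pairs of adjacent words into coarser states.
pairs = h_words[: num_words // 2 * 2].view(-1, 2, 64).mean(dim=1)
print(h_bytes.shape, h_words.shape, pairs.shape)
```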
Mathurin Videau retweeted
elvis @omarsar0
From Bytes to Ideas

Avoids predefined vocabs and memory-heavy embedding tables. Instead, it uses autoregressive U-Nets to embed information directly from raw bytes. This is huge! Enables an effectively unbounded vocab size and more. More in my notes below:
Mathurin Videau retweeted
Aran Komatsuzaki @arankomatsuzaki
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets

- Presents an autoregressive U-Net that processes raw bytes and learns hierarchical token representations
- Matches strong BPE baselines, with deeper hierarchies showing promising scaling trends
Mathurin Videau @mathuvu_
In future work, we plan to make AU-Net hierarchies deeper so models think at even more abstract levels. We only want a portion of the model spending time on syntax and spelling, so most of the compute can be dedicated to thinking about the next idea instead of the next token. 7/8
Mathurin Videau @mathuvu_
We present an Autoregressive U-Net that incorporates tokenization inside the model, pooling raw bytes into words then word-groups. AU-Net focuses most of its compute on building latent vectors that correspond to larger units of meaning. Joint work with @byoubii 1/8
Mathurin Videau retweeted
TimDarcet @TimDarcet
Want strong SSL, but not the complexity of DINOv2? CAPI: Cluster and Predict Latent Patches for Improved Masked Image Modeling.
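Reading the title literally, a hedged sketch of the idea: cluster teacher patch latents into discrete codes, then train a student to predict the code of each masked patch. Every component below is a simplified stand-in, not CAPI's actual pipeline.

```python
import torch
import torch.nn as nn

dim, n_clusters, n_patches = 64, 8, 196
centroids = torch.randn(n_clusters, dim)          # learned/online in practice

teacher_feats = torch.randn(n_patches, dim)       # frozen-teacher patch latents
targets = torch.cdist(teacher_feats, centroids).argmin(dim=1)  # cluster ids

student = nn.Linear(dim, n_clusters)              # predicts cluster assignment
mask = torch.rand(n_patches) < 0.6                # patches hidden from student

student_in = teacher_feats.clone()
student_in[mask] = 0.0                            # crude masking of the input
logits = student(student_in)

# Train only on masked positions: predict the cluster of what was hidden.
loss = nn.functional.cross_entropy(logits[mask], targets[mask])
loss.backward()
```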