Pavankumar Vasu
@PavankumarVasu

44 posts
Joined July 2013
127 Following · 151 Followers

Pinned Tweet
Pavankumar Vasu@PavankumarVasu·
Excited to share code & models for FastVLM — our blazing-fast Vision-Language Model appearing at #CVPR2025. Run it on-device with inference code optimized for Apple Silicon using #mlx. Code: github.com/apple/ml-fastv… Updated paper & results coming soon. Stay tuned! 👀
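The on-device speed story is easiest to see in the prefill arithmetic. A toy sketch (my own illustration, not code from the FastVLM repo; all token counts and model sizes below are assumptions) of how the vision-token count feeds quadratically into prefill self-attention cost:

```python
# Illustrative only: prefill self-attention cost grows quadratically with
# sequence length, so a vision encoder that emits fewer tokens shrinks
# time-to-first-token superlinearly.

def prefill_attention_flops(n_text_tokens: int, n_vision_tokens: int,
                            d_model: int, n_layers: int) -> int:
    """Rough FLOPs for attention score/value computation during prefill."""
    n = n_text_tokens + n_vision_tokens
    # QK^T plus attention-weighted V: ~2 * n^2 * d per layer (constants omitted).
    return 2 * n * n * d_model * n_layers

base = prefill_attention_flops(64, 576, 4096, 32)  # e.g. 576 vision tokens
fast = prefill_attention_flops(64, 144, 4096, 32)  # e.g. 4x fewer vision tokens
print(f"attention-prefill speedup: {base / fast:.1f}x")
```

Cutting vision tokens 4x here yields well over a 4x reduction in attention prefill work, which is why token-efficient vision encoders matter so much for on-device latency.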
Pavankumar Vasu retweeted
Jiatao Gu@thoma_gu·
(1/n) There’s a long-running debate on bringing representation learning into generative modeling—their latent spaces play different roles. 🚀🚀 We present FAE, a simple-yet-effective framework that bridges them with a single attention layer! Paper: huggingface.co/papers/2512.07…
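A minimal numpy sketch of the idea of bridging a generative model's latents with a pretrained representation through one attention layer. This is my own illustration, not the FAE implementation; every name and shape here is an assumption:

```python
import numpy as np

# Hypothetical sketch: a single cross-attention layer lets a generative
# decoder read from a pretrained representation's latent tokens.

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def bridge_attention(decoder_latents, rep_tokens, Wq, Wk, Wv):
    """One attention layer: decoder latents query representation tokens."""
    q = decoder_latents @ Wq              # (n_dec, d) queries
    k = rep_tokens @ Wk                   # (n_rep, d) keys
    v = rep_tokens @ Wv                   # (n_rep, d) values
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (n_dec, n_rep)
    return decoder_latents + attn @ v     # residual update of the decoder state

rng = np.random.default_rng(0)
d = 16
dec = rng.normal(size=(8, d))             # 8 decoder latent tokens
rep = rng.normal(size=(32, d))            # 32 representation tokens
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
out = bridge_attention(dec, rep, Wq, Wk, Wv)
print(out.shape)
```

The point of the sketch: the two latent spaces never need to be merged; a single attention read is enough to inject representation information into generation.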
Pavankumar Vasu retweeted
Yizhe Zhang@YizheZhangNLP·
We use latent continuous thoughts for retrieval optimized via downstream NTP loss, unified under one LLM backbone. Since representations are shared, documents can be precomputed—eliminating 2-stage RAG. We match raw text performance but with a much shorter context budget. 📉🚀
Jie He@Jiehenlp

Happy to introduce my internship work at @Apple . We introduce CLaRa: Continuous Latent Reasoning, an end-to-end training framework that jointly trains retrieval and generation ! 🧠📦 🔗 arxiv.org/pdf/2511.18659… #RAG #LLMs #Retrieval #Reasoning #AI

Pavankumar Vasu retweeted
Jiatao Gu@thoma_gu·
STARFlow gets an upgrade—it now works on videos🎥 We present STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flows, an invertible, causal video generator built on autoregressive flows! 📄 Paper huggingface.co/papers/2511.20… 💻 Code github.com/apple/ml-starf… (1/10)
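"Invertible by construction" is the defining property of normalizing flows. A minimal numpy sketch of the standard building block, an affine coupling layer (this is generic flow machinery, not code from the STARFlow repo; the network on x1 is a placeholder):

```python
import numpy as np

# Affine coupling layer: half the dimensions pass through unchanged, the other
# half get a scale/shift computed from the first half, so inversion is exact.

def coupling_forward(x, w, b):
    x1, x2 = np.split(x, 2, axis=-1)
    h = np.tanh(x1 @ w + b)            # any network of x1 works here
    log_s, t = np.split(h, 2, axis=-1)
    y2 = x2 * np.exp(log_s) + t        # affine transform of x2
    return np.concatenate([x1, y2], axis=-1)

def coupling_inverse(y, w, b):
    y1, y2 = np.split(y, 2, axis=-1)
    h = np.tanh(y1 @ w + b)            # recompute the same scale/shift from y1 == x1
    log_s, t = np.split(h, 2, axis=-1)
    x2 = (y2 - t) * np.exp(-log_s)
    return np.concatenate([y1, x2], axis=-1)

rng = np.random.default_rng(0)
d = 8                                   # even dimensionality
x = rng.normal(size=(4, d))
w = rng.normal(size=(d // 2, d)) * 0.1  # maps x1 (d/2 dims) -> log_s, t (d dims)
b = np.zeros(d)
y = coupling_forward(x, w, b)
x_rec = coupling_inverse(y, w, b)
print(np.allclose(x, x_rec))  # True: the layer inverts exactly
```

Stacking many such layers (with permuted splits) gives a deep, exactly invertible generator, which is what makes end-to-end likelihood training of flow-based video models possible.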
Pavankumar Vasu retweeted
Eran Malach@EranMalach·
SSMs promised efficient language modeling for long context, but so far seem to underperform compared to Transformers in many settings. Our new work suggests that this is not a problem with SSMs, but with how we are currently using them. Arxiv: arxiv.org/pdf/2510.14826 🧵
Pavankumar Vasu retweeted
Fartash Faghri@FartashFg·
🚨While booking your travel for #NeurIPS2025, make sure to stay on Sunday, December 7, 8am-5pm, for the CCFM Workshop (Continual and Compatible Foundation Model Updates). We have received exciting paper contributions and have an amazing lineup of speakers.
Fartash Faghri@FartashFg

Is your AI keeping up with the world? Announcing the #NeurIPS2025 CCFM Workshop: Continual and Compatible Foundation Model Updates. When/Where: Dec. 6-7, San Diego. Submission deadline: Aug. 22, 2025 (opening soon!) sites.google.com/view/ccfm-neur… #FoundationModels #ContinualLearning

Pavankumar Vasu retweeted
Xianhang Li@XianhangLi·
🤔 Ever thought a small teacher could train a student 6× larger that sets new SOTA in training efficiency and frozen evaluation performance for video representation learning?
🤔 Do we really need complex EMA-based self-distillation to prevent collapse, bringing unstable loss dynamics while offering little insight into representation quality?
🚨 In our new paper, we investigate these questions and propose SALT (Static-teacher Asymmetric Latent Training): a simple, scalable, and compute-efficient alternative for video representation learning.
📄 Rethinking JEPA: Compute-Efficient Video SSL with Frozen Teachers
🔗 arxiv.org/abs/2509.24317
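The contrast with EMA self-distillation can be sketched in a few lines of numpy. In the static-teacher setting, the teacher is frozen, so its targets can be precomputed once and the student simply regresses them — no moving-target dynamics. This toy linear version is my own illustration of the training pattern, not the SALT code:

```python
import numpy as np

# Static-teacher distillation, toy version: frozen teacher, precomputed latent
# targets, plain MSE regression for the student. No EMA teacher updates.

rng = np.random.default_rng(0)
d_in, d_out, n = 32, 16, 256
W_teacher = rng.normal(size=(d_in, d_out))   # frozen for all of training
X = rng.normal(size=(n, d_in))
targets = X @ W_teacher                      # teacher latents, computed once

W_student = np.zeros((d_in, d_out))
lr = 1e-3
losses = []
for _ in range(200):
    pred = X @ W_student
    err = pred - targets
    losses.append((err ** 2).mean())
    grad = 2 * X.T @ err / n                 # gradient of the MSE loss
    W_student -= lr * grad
print(losses[0], "->", losses[-1])           # loss decreases monotonically here
```

Because the targets never move, the loss curve is stable and directly interpretable as distance to the teacher's representation — one of the practical advantages the tweet alludes to.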
Pavankumar Vasu@PavankumarVasu·
📢 Releasing MobileCLIP2 (TMLR Featured). Small embedding models that can power your multimodal RAG applications on resource-constrained devices. Models are available on 🤗
Fartash Faghri@FartashFg

🚀Releasing MobileCLIP2 (TMLR Featured). MobileCLIP2-S4 matches the accuracy of SigLIP-SO400M/14 while being 2x smaller, and surpasses DFN ViT-L/14 while being 2.5x faster. Paper: arxiv.org/abs/2508.20691 Code: github.com/apple/ml-mobil… RayGen: github.com/apple/ml-mobil… 🤗huggingface.co/collections/ap… #Apple MLR
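The retrieval step in a multimodal RAG pipeline built on a small embedding model like MobileCLIP2 reduces to nearest-neighbor search over unit-norm embeddings. A hedged sketch (the model call is mocked with random vectors here; only the indexing/scoring logic is shown, and the 512-dim size is an assumption):

```python
import numpy as np

# Cosine-similarity retrieval over precomputed embeddings: with unit-norm
# vectors, cosine similarity is just a dot product, so top-k search is a
# single matrix multiply.

rng = np.random.default_rng(0)
d = 512                                    # embedding dimension (illustrative)

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Precompute unit-norm embeddings for a corpus of documents/images once.
corpus = normalize(rng.normal(size=(1000, d)))

def retrieve(query_emb, corpus, k=5):
    """Return indices and scores of the top-k corpus items by cosine similarity."""
    scores = corpus @ normalize(query_emb)
    top = np.argsort(-scores)[:k]
    return top, scores[top]

# A query embedding close to corpus item 42 should retrieve item 42 first.
query = corpus[42] + 0.01 * rng.normal(size=d)
idx, scores = retrieve(query, corpus)
print(idx[0])
```

On-device, the corpus embeddings are computed offline; at query time only one image/text encoding plus one matrix multiply is needed, which is why small encoders make this practical on constrained hardware.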

Pavankumar Vasu retweeted
Fartash Faghri@FartashFg·
🚨📅The submission deadline for #NeurIPS 2025 CCFM Workshop is just 8 days away on August 22. Get your papers in! Submit your work on Continual and Compatible Foundation Model Updates to the #NeurIPS 2025 CCFM Workshop. Learn more: sites.google.com/view/ccfm-neur…
Pavankumar Vasu retweeted
Max Seitzer@maxseitzer·
Introducing DINOv3 🦕🦕🦕 A SotA-enabling vision foundation model, trained with pure self-supervised learning (SSL) at scale. High-quality dense features, combining unprecedented semantic and geometric scene understanding. Three reasons why this matters…
Pavankumar Vasu retweeted
Andi Marafioti@andimarafioti·
🚀 We're thrilled to launch four new OCR datasets with 20M images: DoclingMatix, SynthFormulaNet, SynthCodeNet, and SynthChartNet. We used them to train SmolDocling, our ultra‑compact (256M) full-page document conversion VLM with performance rivaling models up to 27× larger.
Pavankumar Vasu retweeted
Andrea Santilli@teelinsan·
Uncertainty quantification (UQ) is key for safe, reliable LLMs... but are we evaluating it correctly? 🚨 Our ACL2025 paper finds a hidden flaw: if both UQ methods and correctness metrics are biased by the same factor (e.g., response length), evaluations get systematically skewed.
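The confound is easy to reproduce in simulation. In this sketch (my own construction to illustrate the point, not the paper's code), the UQ score carries no length-independent information at all, yet it correlates strongly with the correctness metric because both are driven by response length:

```python
import numpy as np

# Shared-confounder simulation: a UQ score and a correctness metric that are
# both biased by response length correlate even when the UQ score is pure
# noise with respect to length-independent quality.

rng = np.random.default_rng(0)
n = 5000
length = rng.normal(size=n)                # the shared confounder
uq_score = length + rng.normal(size=n)     # UQ biased by length + noise
correctness = length + rng.normal(size=n)  # metric biased by length + noise

raw_corr = np.corrcoef(uq_score, correctness)[0, 1]

def residual(y, x):
    """Regress out x from y (simple least-squares slope, zero-mean data)."""
    slope = np.dot(x, y) / np.dot(x, x)
    return y - slope * x

# After controlling for length, the apparent relationship vanishes.
adj_corr = np.corrcoef(residual(uq_score, length),
                       residual(correctness, length))[0, 1]
print(f"raw corr ~{raw_corr:.2f}, length-adjusted corr ~{adj_corr:.2f}")
```

The raw correlation is substantial while the length-adjusted one is near zero, which is exactly the kind of systematic skew a naive UQ evaluation would mistake for genuine calibration quality.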
Pavankumar Vasu retweeted
Fartash Faghri@FartashFg·
📢Submissions are now open for #NeurIPS2025 CCFM workshop. Submission deadline: August 22, 2025, AoE. Website: sites.google.com/view/ccfm-neur… Call for papers: sites.google.com/view/ccfm-neur… Submission Link: openreview.net/group?id=NeurI…
Pavankumar Vasu retweeted
Mustafa Shukor@MustafaShukor1·
We propose new scaling laws that predict the optimal data mixture for pretraining LLMs, native multimodal models, and large vision encoders! Only running small-scale experiments is needed, and we can then extrapolate to large-scale ones. These laws allow 1/n 🧵
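The general fit-small-then-extrapolate recipe can be illustrated with a generic power law (the paper's mixture-specific functional form is more involved; everything below, including the synthetic data, is an assumption for the demo):

```python
import numpy as np

# Fit loss(N) = a * N**(-b) on "small-scale" runs via log-log linear
# regression, then extrapolate to a scale far beyond the fitted range.

# Synthetic small-scale results drawn from a known law, for the demo.
a_true, b_true = 20.0, 0.3
small_N = np.array([1e6, 2e6, 4e6, 8e6])        # small-scale run sizes
small_loss = a_true * small_N ** (-b_true)

# In log space the power law is linear: log(loss) = log(a) - b * log(N).
slope, intercept = np.polyfit(np.log(small_N), np.log(small_loss), 1)
b_fit, a_fit = -slope, np.exp(intercept)

# Extrapolate two orders of magnitude beyond the largest fitted run.
big_N = 1e9
predicted = a_fit * big_N ** (-b_fit)
print(f"fitted exponent b = {b_fit:.3f}, predicted loss at N=1e9: {predicted:.4f}")
```

On clean synthetic data the fit recovers the exponent exactly; with real (noisy) runs, multiple seeds and a mixture-aware parameterization are what make the extrapolation trustworthy.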
Pavankumar Vasu retweeted
Rin Metcalf Susa@RinMetcalfSusa·
📣 We are excited to present our work on inferring user preferences from writing samples at @icmlconf Poster Session 3 (Wed. 11:00AM - 1:30PM)! Come by to ✋ chat with us, 📄 learn about our method, and 💻 hear about our new interactive benchmark (🔗s below)!
Pavankumar Vasu retweeted
Fartash Faghri@FartashFg·
🚀Super excited to share TiC-LM (Oral at #ACL2025)! How do we keep FMs up-to-date over months/years? We have a benchmark and lots of insights (arxiv.org/abs/2504.02107).
Also organizing a related @NeurIPSConf 2025 workshop on continual and compatible FMs (CCFM: sites.google.com/view/ccfm-neur…)
Code/Models/Dataset: github.com/apple/ml-tic-lm
Our prior work on TiC-CLIP: arxiv.org/abs/2310.16226
Thanks to @jeffwpli for his amazing work on DCLM, TiC-LM, and other upcoming works during his internship at @Apple MLR, and to everyone at @Apple MLR for helping us do great research.
Jeffrey Li@jeffwpli

Excited to share TiC-LM (Oral at #ACL2025)! LLMs can become outdated ⏲️ and re-training from scratch is costly💰. Ideally, we'd keep reusing and updating models on newer data ♻️. We study continual training as 114 CC months are revealed one-at-a-time. arxiv.org/abs/2504.02107

Pavankumar Vasu retweeted
Jiatao Gu@thoma_gu·
I will be attending #CVPR2025 and presenting our latest research at Apple MLR! Specifically, I will present our highlight poster, world-consistent video diffusion (cvpr.thecvf.com/virtual/2025/p…), and three invited workshop talks, which include our recent preprint ★STARFlow★! (0/n)
Tanishq Mathew Abraham, Ph.D.@iScienceLuvr

STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis "We present STARFlow, a scalable generative model based on normalizing flows that achieves strong performance on high-resolution image synthesis"

Pavankumar Vasu retweeted
Ryan Hoque@ryan_hoque·
Imitation learning has a data scarcity problem. Introducing EgoDex from Apple, the largest and most diverse dataset of dexterous human manipulation to date — 829 hours of egocentric video + paired 3D hand poses across 194 tasks. Now on arxiv: arxiv.org/abs/2505.11709 (1/4)