Ritvik Gupta
@ritvikgupta199
ML at CMU | Maths at IITD'23
30 posts · Joined November 2022
127 Following · 94 Followers

Pinned Tweet
Ritvik Gupta @ritvikgupta199
Check out some really interesting results from our study! Despite the wave of FMs in specialized domains such as genomics, satellite imaging, and time series, most of these models are often no better, and sometimes even worse, than simple supervised baselines.

Quoted: Misha Khodak @khodakmoments
🧵 on surprising revelations from our study of specialized foundation models (FMs beyond vision/text): after evaluating dozens of scientific & time series FMs, we found that most weren't even competitive with simple supervised models, some with as little as 513 parameters. 1/n

Ritvik Gupta retweeted
Pratyush Maini @pratyushmaini
Excited about our NeurIPS'25 tutorial, Data Privacy, Memorization & Copyright in GenAI, with Cooper (co-founder, GenLaw) & Joe (represents OpenAI and Stability in all US copyright litigations). We bring together ML researchers with those who understand the field's legal implications. Pls RT

Ritvik Gupta retweeted
Misha Khodak @khodakmoments
Happy to see a time series workshop @ NeurIPS 2025 motivated in part by our search for BERT moments in specialized foundation models (🧵 here: x.com/khodakmoments/…)

Quoted: Ambroise Odonnat @AmbroiseOdonnat
🚀 We are happy to organize the BERT²S workshop @NeurIPSConf 2025 on Recent Advances in Time Series Foundation Models. 🌐 berts-workshop.github.io 📜 Submit by August 22. 🎓 Speakers and panelists: @ChenghaoLiu15, Mingsheng Long, @zoe_piran, @danielle_maddix, @atalwalkar, @qingsongedu

Ritvik Gupta retweeted
Ameet Talwalkar @atalwalkar
I have some news to share! @datadoghq is forming a new AI research lab, and I'm excited to announce that I've joined as Chief Scientist to lead this effort. Datadog has a great work culture, lots of data and compute, and is committed to open science and open sourcing.

Our team is working on ambitious research areas grounded in real-world challenges in cloud observability and security, with three current areas of focus:
1. Observability Foundation Models for forecasting, anomaly detection, and multi-modal telemetry analysis (logs, metrics, traces, etc.).
2. Site Reliability Engineering (SRE) Agents to detect, diagnose, and resolve production incidents.
3. Production Code Repair Agents that leverage code, logs, and runtime data to identify and fix performance issues.

On a personal note, I'm thrilled to work with Oli and Alexis again (we worked together in the early aughts before they co-founded Datadog). I'm also excited that Datadog, as part of its expansion in AI, is partnering with CMU (note: I will continue to work part-time at CMU and maintain my research activities). Datadog is actively hiring out of our NYC office!

Ritvik Gupta retweeted
Junhong Shen @JunhongShen1
🧵 1/ Introducing ScribeAgent 🤖! Using extensive real-world web workflow data, we at @mldcmu and @ScribeHow have adapted general-purpose open-source LLMs into specialized web agents, outperforming traditional agents that rely on proprietary models like GPT-4 and o1-preview. Learn more about our breakthroughs and findings 👇

Ritvik Gupta retweeted
Ameet Talwalkar @atalwalkar
Turns out that strong baselines still matter in 2024! While there have been a TON of specialized FMs introduced recently, we show that leading FMs in genomics, satellite imaging, and time series are generally no better than (and often worse than) much simpler supervised baselines!

Quoted: Misha Khodak @khodakmoments (same 🧵 as the pinned tweet above)

Ritvik Gupta retweeted
Arena.ai @arena
Which model is best for coding? The @CopilotArena leaderboard is out! Our code completions leaderboard contains data collected over the last month, with >100K completions served and >10K votes! Let's discuss our findings so far 🧵

Ritvik Gupta retweeted
Biology+AI Daily @BiologyAIDaily
Specialized Foundation Models Struggle to Beat Traditional Supervised Learning Baselines

1. This study provides a rigorous comparison between specialized foundation models (FMs) and traditional supervised learning approaches, revealing that well-tuned CNNs and simple autoregressive models often outperform large-scale FMs across genomics, satellite imaging, and time series domains.
2. Through extensive benchmarking on over 30 tasks, including the Nucleotide Transformer benchmark and satellite imagery datasets, the researchers found that traditional models like Wide ResNet and UNet consistently match or surpass the performance of specialized FMs, despite the FMs' extensive pretraining on massive datasets.
3. Key finding: in genomics tasks, DASHA, an optimized CNN workflow with architecture search, outperformed FMs on histone modification tasks, challenging the assumption that transformer-based FMs are universally superior.
4. The study introduces an open-source automated tuning workflow, DASHA, which efficiently combines neural architecture search and hyperparameter optimization, demonstrating that resource-intensive FMs are not always necessary for peak performance in specialized domains.
5. Results also show that a simple autoregressive model, with minimal computational demands, matched or outperformed large FMs on time series forecasting tasks; a minimal sketch of such a baseline appears below. This underscores the importance of strong, well-tuned baselines in FM evaluations.

@atalwalkar @JunhongShen1 @WenduoC @ritvikgupta199
📜 Paper: openreview.net/pdf?id=wgBYYUj…
#FoundationModels #MachineLearning #Bioinformatics #CNN #TimeSeries #SatelliteImaging
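
A minimal sketch of the kind of simple autoregressive baseline point 5 describes, assuming a plain linear model fit by least squares over a fixed lookback window; with a lookback of 512 this has 513 parameters, matching the figure quoted in the thread above (the paper's exact baseline setup may differ):

```python
# Linear autoregressive forecaster: predict the next value as a learned
# weighted sum of the last `lookback` values plus a bias term.
# lookback=512 gives 512 weights + 1 bias = 513 parameters.
import numpy as np

def fit_linear_ar(series: np.ndarray, lookback: int = 512) -> np.ndarray:
    """Fit weights by least squares; the last entry of w is the bias."""
    n = len(series) - lookback
    # Each row is one lookback window; the target is the value right after it.
    X = np.stack([series[i : i + lookback] for i in range(n)])
    X = np.hstack([X, np.ones((n, 1))])  # append a bias column
    y = series[lookback : lookback + n]
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def forecast(series: np.ndarray, w: np.ndarray, horizon: int) -> np.ndarray:
    """Roll forward, feeding each prediction back into the window."""
    lookback = len(w) - 1
    window = list(series[-lookback:])
    preds = []
    for _ in range(horizon):
        nxt = float(np.dot(w[:-1], window) + w[-1])
        preds.append(nxt)
        window = window[1:] + [nxt]
    return np.array(preds)

# Example: fit on a noisy sine wave, then forecast 16 steps ahead.
t = np.arange(4096)
series = np.sin(2 * np.pi * t / 64) + 0.1 * np.random.default_rng(0).standard_normal(len(t))
w = fit_linear_ar(series)
print(forecast(series, w, horizon=16))
```

A baseline this small is exactly the kind of well-tuned supervised model the thread argues FM evaluations should be compared against.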

Ritvik Gupta retweeted
Mononito Goswami @MononitoGoswami
📉 Time Series FMs hold promise, but still struggle to outperform well-tuned statistical baselines 👇 We need better evals, and more research to unlock their full potential 🧠. Let's make today's TSFMs the dumbest we'll ever have! 📈

Quoted: Misha Khodak @khodakmoments (same 🧵 as the pinned tweet above)

Ritvik Gupta retweeted
Vibhhu Sharma @VibhhuSharma
(1/6) Recommender systems shape our digital experiences, filtering much of what we see online. But how should we measure their influence? We provide a unified causal framework to think through this question and develop metrics to audit recommender systems. arxiv.org/abs/2409.13210

Ritvik Gupta retweeted
Ananye Agarwal @anag004
Want to scale RL with your shiny new GPU? 🚀 In our ICML'24 Oral, we find that RL algorithms hit a barrier when data is scaled up. Our new algorithm, SAPG, offers a simple fix: it scales to 25k envs and solves hard tasks where PPO makes no progress. sapg-rl.github.io 1/n

Ritvik Gupta @ritvikgupta199
🧵 Excited to introduce our latest paper on Token Dependency-Aware Variational Information Bottleneck-based Pruning for LLMs (TVA-Prune), presented at the Efficient Systems for Foundation Models II Workshop at ICML 2024. 👇 [1/n]
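
A minimal sketch of the generic variational information bottleneck (VIB) gating idea that this family of pruning methods builds on; the Gaussian gate parameterization and pruning threshold here are illustrative assumptions, and TVA-Prune's actual formulation (including its token-dependency objective) is described in the paper:

```python
# Generic VIB pruning gate: each hidden dimension gets a stochastic
# multiplicative gate z = mu + sigma * eps. A KL penalty pushes
# uninformative gates toward zero; dims whose gate collapses get pruned.
import torch
import torch.nn as nn

class VIBGate(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.mu = nn.Parameter(torch.ones(dim))                  # gate means
        self.log_sigma = nn.Parameter(torch.full((dim,), -3.0))  # log of gate stds

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        if self.training:
            eps = torch.randn_like(self.mu)
            z = self.mu + torch.exp(self.log_sigma) * eps  # reparameterized sample
        else:
            z = self.mu                                    # deterministic at eval time
        return h * z                                       # gate each hidden dimension

    def kl(self) -> torch.Tensor:
        # KL divergence of each gate against a standard normal prior:
        # 0.5 * (mu^2 + sigma^2 - log sigma^2 - 1), summed over dims.
        var = torch.exp(2 * self.log_sigma)
        return 0.5 * torch.sum(self.mu**2 + var - 2 * self.log_sigma - 1)

    def prune_mask(self, thresh: float = 1e-2) -> torch.Tensor:
        return self.mu.abs() > thresh  # keep dims whose gate survived training

# Training objective (sketch): task_loss + beta * sum of kl() over all gates;
# afterwards, dims with prune_mask() == False are removed structurally.
```

Because the gates act on whole heads or hidden dimensions rather than individual weights, the surviving network stays dense, which is what makes structured pruning translate into real inference speedups.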

Ritvik Gupta @ritvikgupta199
@dutta_oshin By preserving token dependencies across model layers, TVA-Prune addresses a critical challenge in structured pruning, ensuring robust performance and efficient inference. 🔗🔬[7/n]

Ritvik Gupta @ritvikgupta199
@dutta_oshin TVA-Prune adheres to user-defined sparsity criteria, making it a versatile and adaptable solution for various deployment scenarios, especially on resource-constrained devices. 🖥️✨[6/n]

Ritvik Gupta @ritvikgupta199
@dutta_oshin We demonstrate the effectiveness of TVA-Prune on pre-trained LLMs, including variants of LLaMA-7B (also LLaMA 3 8B) and Mistral-7B. The results show superior performance and higher inference speedups compared to existing pruning methods. 📊🔍[5/n]

Ritvik Gupta @ritvikgupta199
@dutta_oshin Our method introduces a post-pruning adaptation step that adjusts dimensions to match the block sizes of GPU tensor cores. This step ensures optimal parallelism during inference, leading to significant speedups. ⚙️⚡[4/n]
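
A hypothetical sketch of what such a post-pruning alignment step can look like; the block size of 64 and the "revive the highest-scoring pruned dims" policy are illustrative assumptions, not the paper's exact procedure:

```python
# Round the number of kept dimensions up to a multiple of the tensor-core
# block size so the resulting matmul shapes stay well-tiled on GPU.
import torch

def align_to_block(scores: torch.Tensor, mask: torch.Tensor, block: int = 64) -> torch.Tensor:
    """Expand a boolean keep-mask so the kept count is a multiple of `block`.

    scores: per-dimension importance (e.g., the magnitude of a pruning gate).
    mask:   boolean keep-mask produced by pruning.
    """
    kept = int(mask.sum())
    target = ((kept + block - 1) // block) * block  # round up to a block multiple
    deficit = min(target - kept, int((~mask).sum()))
    if deficit > 0:
        # Revive the highest-scoring dimensions that pruning had removed.
        pruned_scores = scores.masked_fill(mask, float("-inf"))
        revive = pruned_scores.topk(deficit).indices
        mask = mask.clone()
        mask[revive] = True
    return mask

# Example usage with stand-in scores and a stand-in pruning decision.
torch.manual_seed(0)
scores = torch.rand(4096)
mask = scores > 0.93
aligned = align_to_block(scores, mask)
print(int(mask.sum()), "->", int(aligned.sum()))  # kept count becomes a multiple of 64
```

Aligning to the hardware's preferred tile size is what turns nominal sparsity into actual wall-clock speedup, which is the point this step of the thread makes.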

Ritvik Gupta @ritvikgupta199
@dutta_oshin TVA-Prune efficiently removes redundant heads, intermediate dimensions, and global token representations, all on a single GPU. This structured approach ensures that the essential parts of the model are preserved, enhancing both efficiency and performance. 🚀🔧[3/n]