Rishika Bhagwatkar @ NeurIPS ✈️

99 posts

@rishika2110

MSc in CS at Mila Quebec and UdeM

Montréal, Québec · Joined March 2021
327 Following · 169 Followers
Pinned Tweet
Rishika Bhagwatkar @ NeurIPS ✈️
🚨 New Paper Alert! 🚀 Excited to introduce CAVE: A benchmark for detecting and explaining commonsense anomalies in real-world visual environments! Accepted at #EMNLP2025 Main Conference! 🎉 📍Join us at our poster on Nov 5, 16:30-18:00 in Hall C.
[image]
1 reply · 7 reposts · 14 likes · 845 views
Rishika Bhagwatkar @ NeurIPS ✈️ retweeted
Abhay Puri @AbhayPuri98
As @karpathy just highlighted, a single poisoned version of LiteLLM, up for less than an hour, was enough to exfiltrate SSH keys, cloud credentials, API keys, crypto wallets, and more from anyone who ran pip install. The attack used a malicious .pth file, a mechanism that executes automatically when Python starts. No explicit import needed. Just installing the package was enough.

This is a textbook software supply chain attack. But it also points to something deeper that we've been studying. AI systems don't just depend on code. They depend on training data, collection environments, and model artifacts: an entire supply chain that is largely unaudited. And unlike malicious code, which can (in theory) be inspected, poisoned data and weights are far harder to detect.

In our paper "Malice in Agentland," we formalize three threat models that target different layers of this agentic AI supply chain:
1. Data poisoning - an attacker controls a fraction of the training traces used to fine-tune an agent
2. Environmental poisoning - malicious instructions are injected into the webpages or tools an agent interacts with during data collection
3. Weight poisoning - a pre-backdoored base model is fine-tuned on clean data, and the backdoor survives

The results are alarming. Poisoning as few as 2% of collected traces is enough to embed a trigger-activated backdoor that causes an agent to silently leak confidential user information with over 80% success. And all the defenses we tested (two guardrail models and one weight-based defense) failed to catch it.

The LiteLLM attack stole credentials. An equivalent attack on the AI supply chain could implant persistent behavioral backdoors: agents that behave normally until a specific trigger phrase appears, then silently exfiltrate data, manipulate outputs, or take unauthorized actions. And because these backdoors live in model weights rather than source code, they evade the inspection tools we rely on today.
As we know, every dependency you install could be hiding a poisoned package deep in its tree. The same is true for every dataset, every pretrained checkpoint, and every training pipeline. As AI agents gain autonomy, securing the full stack (code, data, environments, and weights) is no longer optional. Read the full paper: arxiv.org/abs/2510.05159
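The .pth mechanism mentioned in the thread is easy to demonstrate. This is a minimal, benign sketch (not the LiteLLM payload): Python's `site` module executes any line in a site-packages `.pth` file that starts with `import`, so simply having the file on disk runs code at interpreter startup. Here we trigger the same code path explicitly via `site.addsitedir` on a temp directory.

```python
# Minimal sketch of why .pth files are dangerous: the stdlib `site`
# module exec()s any line in a .pth file that begins with "import".
# A benign demo payload just sets an env var; a real attack would hide
# exfiltration code in that one line.
import os
import site
import tempfile

demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "demo.pth"), "w") as f:
    f.write('import os; os.environ["PTH_RAN"] = "1"\n')

# At startup, site-packages is scanned automatically; here we invoke
# the same scan by hand on our temp directory.
site.addsitedir(demo_dir)

print(os.environ.get("PTH_RAN"))  # -> "1": the .pth line executed
```

Note that nothing ever imported the "package" itself; processing the `.pth` file was enough, which is exactly why `pip install` alone sufficed in the attack.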
[image]
Andrej Karpathy @karpathy

Software horror: litellm PyPI supply chain attack. A simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure creds, Kubernetes configs, git credentials, env vars (all your API keys), shell history, crypto wallets, SSL private keys, CI/CD secrets, and database passwords.

LiteLLM itself has 97 million downloads per month, which is already terrible, but much worse, the contagion spreads to any project that depends on litellm. For example, if you did `pip install dspy` (which depended on litellm>=1.64.0), you'd also be pwned. Same for any other large project that depended on litellm.

Afaict the poisoned version was up for less than ~1 hour. The attack had a bug which led to its discovery: Callum McMahon was using an MCP plugin inside Cursor that pulled in litellm as a transitive dependency. When litellm 1.82.8 installed, their machine ran out of RAM and crashed. So if the attacker hadn't vibe coded this attack, it could have gone undetected for many days or weeks.

Supply chain attacks like this are basically the scariest thing imaginable in modern software. Every time you install any dependency, you could be pulling in a poisoned package anywhere deep inside its entire dependency tree. This is especially risky with large projects that might have lots and lots of dependencies. The credentials stolen in each attack can then be used to take over more accounts and compromise more packages.

Classical software engineering would have you believe that dependencies are good (we're building pyramids from bricks), but imo this has to be re-evaluated, and it's why I've grown increasingly averse to them, preferring to use LLMs to "yoink" functionality when it's simple enough and possible.
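The "anywhere deep inside its entire dependency tree" point is worth making concrete: the set of packages you actually trust is the transitive closure of your direct installs. A minimal sketch of walking that closure with the stdlib `importlib.metadata` (illustrative only; real auditing tools like pip-audit or pipdeptree do far more, such as checking advisories and resolving versions):

```python
# Minimal sketch: enumerate every package a given install pulls in,
# by recursively following declared requirements of installed packages.
import re
from importlib.metadata import requires, PackageNotFoundError

def transitive_deps(pkg, seen=None):
    """Return the set of (lowercased) package names `pkg` depends on,
    directly or transitively, among packages installed locally."""
    seen = set() if seen is None else seen
    try:
        reqs = requires(pkg) or []
    except PackageNotFoundError:
        return seen  # not installed here; nothing more to follow
    for spec in reqs:
        if "extra ==" in spec:  # skip optional extras for this sketch
            continue
        # Strip version pins / markers: "litellm>=1.64.0" -> "litellm"
        name = re.split(r"[<>=!~;\[ ]", spec, maxsplit=1)[0].lower()
        if name and name not in seen:
            seen.add(name)
            transitive_deps(name, seen)
    return seen

print(sorted(transitive_deps("pip")))
```

Every name this prints is a package whose compromise compromises you, which is the threat model the tweet describes.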

0 replies · 7 reposts · 17 likes · 1K views
Rishika Bhagwatkar @ NeurIPS ✈️ retweeted
Nathan Godey @nthngdy
🧵New paper: "Lost in Backpropagation: The LM Head is a Gradient Bottleneck" The output layer of LLMs destroys 95-99% of your training signal during backpropagation, and this significantly slows down pretraining 👇
[image]
27 replies · 106 reposts · 958 likes · 119.5K views
Rishika Bhagwatkar @ NeurIPS ✈️ retweeted
Nishanth Anand @itsNVA7
Continual learning should be viewed through the lens of the stability-plasticity dilemma, and solving it requires rethinking learning architectures. I argue why in my first continual learning blog: itsnva7.substack.com/p/continual-le…
3 replies · 13 reposts · 116 likes · 7.4K views
Rishika Bhagwatkar @ NeurIPS ✈️ retweeted
Nishanth Anand @itsNVA7
I am excited to share the past 5+ years of my PhD work on continual reinforcement learning at @Cohere_Labs on March 3rd! Doina and I spent significant time and thought developing this framework; we believe this holds the key to continual learning.
Cohere Labs @Cohere_Labs

Our Reinforcement Learning group is excited to welcome @itsNVA7 for a presentation on "The permanent and transient framework for continual reinforcement learning" on Tuesday, March 3rd. Thanks to @rahul_narava and Gusti Triandi Winata for organizing this event! 🔥 Learn more: cohere.com/events/cohere-…

0 replies · 4 reposts · 35 likes · 1.9K views
Rishika Bhagwatkar @ NeurIPS ✈️
Excited to share that I'll be at #NeurIPS2025 from Dec 2-7, presenting two of my works 🚀 1. LLM Agent Safety Against Prompt Injections 2. Anomaly Detection Using VLMs I'm looking for PhD positions starting Fall 2026. Happy to connect; feel free to DM :) Poster details in 🧵
[image] [image]
2 replies · 6 reposts · 12 likes · 777 views