Rishika Bhagwatkar @ NeurIPS ✈️

99 posts

@rishika2110

MSc in CS at Mila Quebec and UdeM

Montréal, Québec · Joined March 2021
327 Following · 169 Followers
Pinned Tweet
Rishika Bhagwatkar @ NeurIPS ✈️
🚨 New Paper Alert! 🚀 Excited to introduce CAVE: A benchmark for detecting and explaining commonsense anomalies in real-world visual environments! Accepted at #EMNLP2025 Main Conference! 🎉 📍Join us at our poster on Nov 5, 16:30-18:00 in Hall C.
[image]
1 reply · 7 reposts · 14 likes · 845 views
Rishika Bhagwatkar @ NeurIPS ✈️ retweeted
Abhay Puri @AbhayPuri98
As @karpathy just highlighted, a single poisoned version of LiteLLM, up for less than an hour, was enough to exfiltrate SSH keys, cloud credentials, API keys, crypto wallets, and more from anyone who ran pip install. The attack used a malicious .pth file, a mechanism that executes automatically when Python starts. No explicit import needed. Just installing the package was enough.

This is a textbook software supply chain attack. But it also points to something deeper that we've been studying. AI systems don't just depend on code. They depend on training data, collection environments, and model artifacts: an entire supply chain that is largely unaudited. And unlike malicious code, which can (in theory) be inspected, poisoned data and weights are far harder to detect.

In our paper "Malice in Agentland," we formalize three threat models that target different layers of this agentic AI supply chain:
1. Data poisoning - an attacker controls a fraction of the training traces used to fine-tune an agent
2. Environmental poisoning - malicious instructions are injected into the webpages or tools an agent interacts with during data collection
3. Weight poisoning - a pre-backdoored base model is fine-tuned on clean data, and the backdoor survives

The results are alarming. Poisoning as few as 2% of collected traces is enough to embed a trigger-activated backdoor that causes an agent to silently leak confidential user information with over 80% success. And all the defenses we tested (two guardrail models and one weight-based defense) failed to catch it.

The LiteLLM attack stole credentials. An equivalent attack on the AI supply chain could implant persistent behavioral backdoors: agents that behave normally until a specific trigger phrase appears, then silently exfiltrate data, manipulate outputs, or take unauthorized actions. And because these backdoors live in model weights rather than source code, they evade the inspection tools we rely on today.
As we know, every dependency you install could be hiding a poisoned package deep in its tree. The same is true for every dataset, every pretrained checkpoint, and every training pipeline. As AI agents gain autonomy, securing the full stack (code, data, environments, and weights) is no longer optional. Read the full paper: arxiv.org/abs/2510.05159
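The .pth mechanism mentioned in the thread is easy to demonstrate. This is a minimal, benign sketch (not the LiteLLM payload): Python's `site` module executes any line in a site-packages `.pth` file that starts with `import`, so simply having the file on disk runs code at interpreter startup. Here we trigger the same code path explicitly via `site.addsitedir` on a temp directory.

```python
# Minimal sketch of why .pth files are dangerous: the stdlib `site`
# module exec()s any line in a .pth file that begins with "import".
# A benign demo payload just sets an env var; a real attack would hide
# exfiltration code in that one line.
import os
import site
import tempfile

demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "demo.pth"), "w") as f:
    f.write('import os; os.environ["PTH_RAN"] = "1"\n')

# At startup, site-packages is scanned automatically; here we invoke
# the same scan by hand on our temp directory.
site.addsitedir(demo_dir)

print(os.environ.get("PTH_RAN"))  # -> "1": the .pth line executed
```

Note that nothing ever imported the "package" itself; processing the `.pth` file was enough, which is exactly why `pip install` alone sufficed in the attack.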
[image]
Andrej Karpathy @karpathy

Software horror: litellm PyPI supply chain attack. A simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure creds, Kubernetes configs, git credentials, env vars (all your API keys), shell history, crypto wallets, SSL private keys, CI/CD secrets, and database passwords.

LiteLLM itself has 97 million downloads per month, which is already terrible, but much worse, the contagion spreads to any project that depends on litellm. For example, if you did `pip install dspy` (which depended on litellm>=1.64.0), you'd also be pwned. Same for any other large project that depended on litellm.

Afaict the poisoned version was up for less than ~1 hour. The attack had a bug which led to its discovery: Callum McMahon was using an MCP plugin inside Cursor that pulled in litellm as a transitive dependency. When litellm 1.82.8 installed, their machine ran out of RAM and crashed. So if the attacker hadn't vibe coded this attack, it could have gone undetected for many days or weeks.

Supply chain attacks like this are basically the scariest thing imaginable in modern software. Every time you install any dependency, you could be pulling in a poisoned package anywhere deep inside its entire dependency tree. This is especially risky with large projects that might have lots and lots of dependencies. The credentials stolen in each attack can then be used to take over more accounts and compromise more packages.

Classical software engineering would have you believe that dependencies are good (we're building pyramids from bricks), but imo this has to be re-evaluated, and it's why I've grown increasingly averse to them, preferring to use LLMs to "yoink" functionality when it's simple enough and possible.
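The "anywhere deep inside its entire dependency tree" point is worth making concrete: the set of packages you actually trust is the transitive closure of your direct installs. A minimal sketch of walking that closure with the stdlib `importlib.metadata` (illustrative only; real auditing tools like pip-audit or pipdeptree do far more, such as checking advisories and resolving versions):

```python
# Minimal sketch: enumerate every package a given install pulls in,
# by recursively following declared requirements of installed packages.
import re
from importlib.metadata import requires, PackageNotFoundError

def transitive_deps(pkg, seen=None):
    """Return the set of (lowercased) package names `pkg` depends on,
    directly or transitively, among packages installed locally."""
    seen = set() if seen is None else seen
    try:
        reqs = requires(pkg) or []
    except PackageNotFoundError:
        return seen  # not installed here; nothing more to follow
    for spec in reqs:
        if "extra ==" in spec:  # skip optional extras for this sketch
            continue
        # Strip version pins / markers: "litellm>=1.64.0" -> "litellm"
        name = re.split(r"[<>=!~;\[ ]", spec, maxsplit=1)[0].lower()
        if name and name not in seen:
            seen.add(name)
            transitive_deps(name, seen)
    return seen

print(sorted(transitive_deps("pip")))
```

Every name this prints is a package whose compromise compromises you, which is the threat model the tweet describes.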

0 replies · 7 reposts · 17 likes · 1K views
Rishika Bhagwatkar @ NeurIPS ✈️ retweeted
Nathan Godey @nthngdy
🧵New paper: "Lost in Backpropagation: The LM Head is a Gradient Bottleneck" The output layer of LLMs destroys 95-99% of your training signal during backpropagation, and this significantly slows down pretraining 👇
[image]
27 replies · 106 reposts · 958 likes · 119.5K views
Rishika Bhagwatkar @ NeurIPS ✈️ retweeted
Nishanth Anand @itsNVA7
Continual learning should be viewed through the lens of the stability-plasticity dilemma, and solving it requires rethinking learning architectures. I argue why in my first continual learning blog: itsnva7.substack.com/p/continual-le…
3 replies · 13 reposts · 116 likes · 7.4K views
Rishika Bhagwatkar @ NeurIPS ✈️ retweeted
Nishanth Anand @itsNVA7
I am excited to share the past 5+ years of my PhD work on continual reinforcement learning at @Cohere_Labs on March 3rd! Doina and I spent significant time and thought developing this framework; we believe this holds the key to continual learning.
Cohere Labs @Cohere_Labs

Our Reinforcement Learning group is excited to welcome @itsNVA7 for a presentation on "The permanent and transient framework for continual reinforcement learning" on Tuesday, March 3rd. Thanks to @rahul_narava and Gusti Triandi Winata for organizing this event! 🔥 Learn more: cohere.com/events/cohere-…

0 replies · 4 reposts · 35 likes · 1.9K views
Rishika Bhagwatkar @ NeurIPS ✈️
Excited to share that I'll be at #NeurIPS2025 from Dec 2-7, presenting two of my works 🚀 1. LLM Agent Safety Against Prompt Injections 2. Anomaly Detection Using VLMs I'm looking for PhD positions starting Fall 2026. Happy to connect; feel free to DM :) Poster details in 🧵
[image] [image]
2 replies · 6 reposts · 12 likes · 777 views