Daniel Stoddart
6K posts

@danielstoddart
Ambidexter SRE from Philadelphia. Humanities enjoyer. Amateur musician. On a sugar cane plantation hacienda situated on a volcanic island in the Philippines.

Software horror: litellm PyPI supply chain attack. Simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure creds, Kubernetes configs, git credentials, env vars (all your API keys), shell history, crypto wallets, SSL private keys, CI/CD secrets, database passwords. LiteLLM itself has 97 million downloads per month which is already terrible, but much worse, the contagion spreads to any project that depends on litellm. For example, if you did `pip install dspy` (which depended on litellm>=1.64.0), you'd also be pwnd. Same for any other large project that depended on litellm. Afaict the poisoned version was up for less than ~1 hour. The attack had a bug which led to its discovery - Callum McMahon was using an MCP plugin inside Cursor that pulled in litellm as a transitive dependency. When litellm 1.82.8 installed, their machine ran out of RAM and crashed. So if the attacker hadn't vibe coded this attack, it could have gone undetected for many days or weeks. Supply chain attacks like this are basically the scariest thing imaginable in modern software. Every time you install any dependency you could be pulling in a poisoned package anywhere deep inside its entire dependency tree. This is especially risky with large projects that might have lots and lots of dependencies. The credentials that do get stolen in each attack can then be used to take over more accounts and compromise more packages. Classical software engineering would have you believe that dependencies are good (we're building pyramids from bricks), but imo this has to be re-evaluated, and it's why I've become increasingly averse to them, preferring to use LLMs to "yoink" functionality when it's simple enough and possible.
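To make the "contagion" point concrete, here is a minimal sketch (not an official tool) that lists which installed packages declare a direct dependency on a given package such as litellm. It uses only the standard library's `importlib.metadata`; the helper names (`normalize`, `requirement_name`, `direct_dependents`, `installed_requirements`) are made up for illustration, and the requirement parsing is a simplified regex rather than a full PEP 508 parser. It only finds direct dependents, so deeper transitive chains would need a recursive walk.

```python
import re
from importlib import metadata


def normalize(name: str) -> str:
    """Normalize a project name the way PEP 503 describes (case, -, _, .)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(req: str) -> str:
    """Extract the project name from a requirement string like 'litellm>=1.64.0'.

    Simplified assumption: the name is the leading run of name characters.
    """
    m = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
    return normalize(m.group(1)) if m else ""


def direct_dependents(target: str, requires_by_package: dict) -> list:
    """Return the packages whose declared requirements include `target`."""
    wanted = normalize(target)
    return sorted(
        pkg
        for pkg, reqs in requires_by_package.items()
        if any(requirement_name(r) == wanted for r in reqs)
    )


def installed_requirements() -> dict:
    """Map each installed distribution to its declared requirement strings."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            result[normalize(name)] = list(dist.requires or [])
    return result


if __name__ == "__main__":
    # Print every installed package that directly depends on litellm.
    for pkg in direct_dependents("litellm", installed_requirements()):
        print(pkg)
```

Running this in an environment before and after adding a new dependency is one cheap way to notice that a package like litellm arrived indirectly.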

I’ve had many conversations with very intelligent folks about this issue. Too many think this way: “I don’t care what country wins at robots. What difference does it make to me?” I’ll ask the question again when they look out the window and there’s 1000 of these in the street.



Breakdown of AWS outage in simple words
1. Sunday night, a DNS problem hit AWS - DynamoDB endpoint lost.
2. This meant services couldn't find DynamoDB (a database that stores tons of data).
3. AWS fixed the DNS issue in about 3 hours.
4. But then EC2 (the system that creates virtual servers) broke because it needs DynamoDB to work.
5. Then the system that checks if network load balancers are healthy also failed.
6. This crashed Lambda, CloudWatch, SQS, and 75+ other services - everything that needed network connectivity.
7. This created a chain reaction - servers couldn't talk to each other, new servers couldn't start, everything got stuck.
8. AWS had to intentionally slow down EC2 launches and Lambda functions to prevent total collapse.
9. Recovery took 15+ hours as they fixed each broken service while clearing massive backlogs of stuck requests.
This outage impacted: Snapchat, Roblox, Fortnite, McDonald's app, Ring doorbells, banks, and 1,000+ more websites. This all happened in one AWS region (us-east-1). This is why multi-region architecture isn't optional anymore.
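As a rough illustration of the multi-region point, here is a minimal client-side failover sketch. It is an assumption-laden toy, not how AWS or any SDK does it: `call_with_region_failover` is a hypothetical helper, and `request` stands in for whatever per-region call an application makes. Real multi-region setups also need replicated data and health checks, which this does not cover.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def call_with_region_failover(
    regions: Sequence[str], request: Callable[[str], T]
) -> Tuple[str, T]:
    """Try `request` against each region in order; return the first success.

    `request` is a caller-supplied function (hypothetical here) that takes a
    region name and either returns a result or raises an exception.
    """
    errors = []
    for region in regions:
        try:
            return region, request(region)
        except Exception as exc:  # any failure moves on to the next region
            errors.append(f"{region}: {exc}")
    raise RuntimeError("all regions failed: " + "; ".join(errors))
```

A caller would pass its preferred region first, for example `["us-east-1", "us-west-2"]`, so the second region is only used when the first one raises.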



Ireland passed a living wage for artists. They found that for every $1 invested in artists, the artists returned $1.39, so they are rolling out a basic income program for artists. Incredible.






deleting windows will genuinely shift ur perspective in tech.











