Suraj Deshmukh | सुरज देशमुख

5.2K posts

Suraj Deshmukh | सुरज देशमुख banner
Suraj Deshmukh | सुरज देशमुख

Suraj Deshmukh | सुरज देशमुख

@surajd_

Confidential Containers @Microsoft | ex-@kinvolkio ex-@RedHat | bibliophile | He/Him | Opinions are my own. 🟥🟩 🟦🟨

SEA | IXU | BLR Katılım Nisan 2011
2K Takip Edilen1.7K Takipçiler
Suraj Deshmukh | सुरज देशमुख retweetledi
Andrej Karpathy
Andrej Karpathy@karpathy·
A few random notes from claude coding quite a bit last few weeks. Coding workflow. Given the latest lift in LLM coding capability, like many others I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December. i.e. I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit but the power to operate over software in large "code actions" is just too net useful, especially once you adapt to it, configure it, learn to use it, and wrap your head around what it can and cannot do. This is easily the biggest change to my basic coding workflow in ~2 decades of programming and it happened over the course of a few weeks. I'd expect something similar to be happening to well into double digit percent of engineers out there, while the awareness of it in the general population feels well into low single digit percent. IDEs/agent swarms/fallability. Both the "no need for IDE anymore" hype and the "agent swarm" hype is imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don't manage their confusion, they don't seek clarifications, they don't surface inconsistencies, they don't present tradeoffs, they don't push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don't clean up dead code after themselves, etc. They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?" and they will be like "of course!" and immediately cut it down to 100 lines. They still sometimes change/remove comments and code they don't like or don't sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE . md. Despite all these issues, it is still a net huge improvement and it's very difficult to imagine going back to manual coding. TLDR everyone has their developing flow, my current is a small few CC sessions on the left in ghostty windows/tabs and an IDE on the right for viewing the code + manual edits. Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work and that with LLMs in hand it has been dramatically increased. Speedups. It's not clear how to measure the "speedup" of LLM assistance. Certainly I feel net way faster at what I was going to do, but the main effect is that I do a lot more than I was going to do because 1) I can code up all kinds of things that just wouldn't have been worth coding before and 2) I can approach code that I couldn't work on before because of knowledge/skill issue. So certainly it's speedup, but it's possibly a lot more an expansion. Leverage. LLMs are exceptionally good at looping until they meet specific goals and this is where most of the "feel the AGI" magic is to be found. Don't tell it what to do, give it success criteria and watch it go. Get it to write tests first and then pass them. Put it in the loop with a browser MCP. Write the naive algorithm that is very likely correct first, then ask it to optimize it while preserving correctness. Change your approach from imperative to declarative to get the agents looping longer and gain leverage. Fun. I didn't anticipate that with agents programming feels *more* fun because a lot of the fill in the blanks drudgery is removed and what remains is the creative part. I also feel less blocked/stuck (which is not fun) and I experience a lot more courage because there's almost always a way to work hand in hand with it to make some positive progress. I have seen the opposite sentiment from other people too; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building. Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it. Slopacolypse. I am bracing for 2026 as the year of the slopacolypse across all of github, substack, arxiv, X/instagram, and generally all digital media. We're also going to see a lot more AI hype productivity theater (is that even possible?), on the side of actual, real improvements. Questions. A few of the questions on my mind: - What happens to the "10X engineer" - the ratio of productivity between the mean and the max engineer? It's quite possible that this grows *a lot*. - Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill in the blanks (the micro) than grand strategy (the macro). - What does LLM coding feel like in the future? Is it like playing StarCraft? Playing Factorio? Playing music? - How much of society is bottlenecked by digital knowledge work? TLDR Where does this leave us? LLM agent capabilities (Claude & Codex especially) have crossed some kind of threshold of coherence around December 2025 and caused a phase shift in software engineering and closely related. The intelligence part suddenly feels quite a bit ahead of all the rest of it - integrations (tools, knowledge), the necessity for new organizational workflows, processes, diffusion more generally. 2026 is going to be a high energy year as the industry metabolizes the new capability.
English
1.6K
5.4K
39.5K
7.7M
Suraj Deshmukh | सुरज देशमुख retweetledi
Aakash Gupta
Aakash Gupta@aakashgupta·
Someone just poisoned the Python package that manages AI API keys for NASA, Netflix, Stripe, and NVIDIA.. 97 million downloads a month.. and a simple pip install was enough to steal everything on your machine. The attacker picked the one package whose entire job is holding every AI credential in the organization in one place. OpenAI keys, Anthropic keys, Google keys, Amazon keys… all routed through one proxy. All compromised at once. The poisoned version was published straight to PyPI.. no code on GitHub.. no release tag.. no review. Just a file that Python runs automatically on startup. You didn’t need to import it. You didn’t need to call it. The malware fired the second the package existed on your machine. The attacker vibe coded it… the malware was so sloppy it crashed computers.. used so much RAM a developer noticed their machine dying and investigated. They found LiteLLM had been pulled in through a Cursor MCP plugin they didn’t even know they had. That crash is the only reason thousands of companies aren’t fully exfiltrated right now. If the code had been cleaner nobody notices for weeks. Maybe months. The attack chain is the part that gets worse every sentence. TeamPCP compromised Trivy first. A security scanning tool. On March 19. LiteLLM used Trivy in its own CI pipeline… so the credentials stolen from the SECURITY product were used to hijack the AI product that holds all your other credentials. Then they hit GitHub Actions. Then Docker Hub. Then npm. Then Open VSX. Five package ecosystems in two weeks. Each breach giving them the credentials to unlock the next one. The payload was three stages.. harvest every SSH key, cloud token, Kubernetes secret, crypto wallet, and .env file on the machine.. deploy privileged containers across every node in the cluster.. install a persistent backdoor waiting for new instructions. TeamPCP posted on Telegram after: “Many of your favourite security tools and open-source projects will be targeted in the months to come.. stay tuned.” Every AI agent, copilot, and internal tool your company shipped this year runs on hundreds of packages exactly like this one… nobody chose to install LiteLLM on that developer’s machine. It came in as a dependency of a dependency of a plugin. One compromised maintainer account turned the entire trust chain into a credential harvesting operation across thousands of production environments in hours. The companies deploying AI the fastest right now have the least visibility into what’s underneath it.
Andrej Karpathy@karpathy

Software horror: litellm PyPI supply chain attack. Simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure creds, Kubernetes configs, git credentials, env vars (all your API keys), shell history, crypto wallets, SSL private keys, CI/CD secrets, database passwords. LiteLLM itself has 97 million downloads per month which is already terrible, but much worse, the contagion spreads to any project that depends on litellm. For example, if you did `pip install dspy` (which depended on litellm>=1.64.0), you'd also be pwnd. Same for any other large project that depended on litellm. Afaict the poisoned version was up for only less than ~1 hour. The attack had a bug which led to its discovery - Callum McMahon was using an MCP plugin inside Cursor that pulled in litellm as a transitive dependency. When litellm 1.82.8 installed, their machine ran out of RAM and crashed. So if the attacker didn't vibe code this attack it could have been undetected for many days or weeks. Supply chain attacks like this are basically the scariest thing imaginable in modern software. Every time you install any depedency you could be pulling in a poisoned package anywhere deep inside its entire depedency tree. This is especially risky with large projects that might have lots and lots of dependencies. The credentials that do get stolen in each attack can then be used to take over more accounts and compromise more packages. Classical software engineering would have you believe that dependencies are good (we're building pyramids from bricks), but imo this has to be re-evaluated, and it's why I've been so growingly averse to them, preferring to use LLMs to "yoink" functionality when it's simple enough and possible.

English
297
2.2K
10.9K
2.7M
Suraj Deshmukh | सुरज देशमुख retweetledi
Daniel Hnyk
Daniel Hnyk@hnykda·
LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM pypi release 1.82.8. It has been compromised, it contains litellm_init.pth with base64 encoded instructions to send all the credentials it can find to remote server + self-replicate. link below
English
309
2.3K
9.4K
5.7M
Suraj Deshmukh | सुरज देशमुख retweetledi
Andrej Karpathy
Andrej Karpathy@karpathy·
Software horror: litellm PyPI supply chain attack. Simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure creds, Kubernetes configs, git credentials, env vars (all your API keys), shell history, crypto wallets, SSL private keys, CI/CD secrets, database passwords. LiteLLM itself has 97 million downloads per month which is already terrible, but much worse, the contagion spreads to any project that depends on litellm. For example, if you did `pip install dspy` (which depended on litellm>=1.64.0), you'd also be pwnd. Same for any other large project that depended on litellm. Afaict the poisoned version was up for only less than ~1 hour. The attack had a bug which led to its discovery - Callum McMahon was using an MCP plugin inside Cursor that pulled in litellm as a transitive dependency. When litellm 1.82.8 installed, their machine ran out of RAM and crashed. So if the attacker didn't vibe code this attack it could have been undetected for many days or weeks. Supply chain attacks like this are basically the scariest thing imaginable in modern software. Every time you install any depedency you could be pulling in a poisoned package anywhere deep inside its entire depedency tree. This is especially risky with large projects that might have lots and lots of dependencies. The credentials that do get stolen in each attack can then be used to take over more accounts and compromise more packages. Classical software engineering would have you believe that dependencies are good (we're building pyramids from bricks), but imo this has to be re-evaluated, and it's why I've been so growingly averse to them, preferring to use LLMs to "yoink" functionality when it's simple enough and possible.
Daniel Hnyk@hnykda

LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM pypi release 1.82.8. It has been compromised, it contains litellm_init.pth with base64 encoded instructions to send all the credentials it can find to remote server + self-replicate. link below

English
1.4K
5.4K
28K
66.4M
Suraj Deshmukh | सुरज देशमुख
2/n Agentic engineering is structured: start with TDD, run tests before touching code, delegate to sub-agents for test execution, automate your manual tests, and move seamlessly from prototype to real implementation.
English
1
0
0
31
Suraj Deshmukh | सुरज देशमुख
1/n I just spent time reading @simonw’s guide on agentic engineering patterns and it shifted how I think about coding with AI. The mindset isn’t “let the AI figure it out” — that’s vibe-coding™️.
English
2
0
1
107
Suraj Deshmukh | सुरज देशमुख retweetledi
Thariq
Thariq@trq212·
Using Skills well is a skill issue. I didn't quite realize how much until I wrote this, the best can completely transform how your team works.
Thariq@trq212

x.com/i/article/2033…

English
90
237
3.8K
691.5K
Suraj Deshmukh | सुरज देशमुख
1/2) I built a Goodreads skill for Openclaw! Goodreads killed their API in 2020. This skill brings it back — via browser automation. ✅ Search books ✅ Get details & reviews ✅ Personalized recommendations ✅ Manage your reading lists Just type /goodreads search for dune and go.
English
1
0
1
66