David Weinstein

1.5K posts

@insitusec

security, binary analysis, startups, AI Things… CTO @NowSecureMobile

Miami / Boston / DC · Joined July 2012
5.9K Following · 2.7K Followers
Andrej Karpathy@karpathy·
Software horror: litellm PyPI supply chain attack. Simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure creds, Kubernetes configs, git credentials, env vars (all your API keys), shell history, crypto wallets, SSL private keys, CI/CD secrets, database passwords.

LiteLLM itself has 97 million downloads per month, which is already terrible, but much worse, the contagion spreads to any project that depends on litellm. For example, if you did `pip install dspy` (which depended on litellm>=1.64.0), you'd also be pwnd. Same for any other large project that depended on litellm. Afaict the poisoned version was up for less than ~1 hour.

The attack had a bug which led to its discovery - Callum McMahon was using an MCP plugin inside Cursor that pulled in litellm as a transitive dependency. When litellm 1.82.8 installed, their machine ran out of RAM and crashed. So if the attacker didn't vibe code this attack, it could have gone undetected for many days or weeks.

Supply chain attacks like this are basically the scariest thing imaginable in modern software. Every time you install any dependency you could be pulling in a poisoned package anywhere deep inside its entire dependency tree. This is especially risky with large projects that might have lots and lots of dependencies. The credentials that do get stolen in each attack can then be used to take over more accounts and compromise more packages.

Classical software engineering would have you believe that dependencies are good (we're building pyramids from bricks), but imo this has to be re-evaluated, and it's why I've grown increasingly averse to them, preferring to use LLMs to "yoink" functionality when it's simple enough and possible.
Daniel Hnyk@hnykda

LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM PyPI release 1.82.8 has been compromised: it contains litellm_init.pth with base64-encoded instructions to send all the credentials it can find to a remote server and self-replicate. Link below.
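The `.pth` detail is the scary part: CPython's `site` module scans site directories for `.pth` files and executes any line beginning with `import ` at interpreter startup, so a malicious package that drops one into site-packages runs on every Python launch, with no import of the package needed. A minimal sketch of the mechanism with a harmless payload (the filename and env var are illustrative, not the actual attack code):

```python
import os
import site
import tempfile

# The site module processes .pth files in site directories; lines starting
# with "import " are exec'd at interpreter startup. A malicious wheel only
# needs to place such a file in site-packages.
d = tempfile.mkdtemp()
pth = os.path.join(d, "demo_init.pth")  # stand-in for litellm_init.pth
with open(pth, "w") as f:
    # Harmless stand-in for the attacker's base64-decoded payload.
    f.write('import os; os.environ["PTH_DEMO_RAN"] = "1"\n')

site.addsitedir(d)  # simulates the scan that happens on every startup
print(os.environ.get("PTH_DEMO_RAN"))  # -> 1
```

The same scan runs automatically against the real site-packages directory on normal interpreter startup, which is why the payload needs no cooperation from the victim's code.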

David Weinstein@insitusec·
@swyx @karpathy Couldn't this just insert some additional code into the runtime of the dependency, and not just the package install scripts? Unfortunately I don't think that's enough.
swyx@swyx·
@karpathy we should probably also treat this as a wake up moment for all nouveau package managers - uv and bun presumably - to make these entire classes of things far less risky, e.g. by adding a lot of guards on install scripts, up to the point of manually approving batches of network calls
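One crude approximation of such a guard is available to anyone today: inspect the artifact for startup/install code-execution vectors before installing it. A hypothetical pre-install check (the function name and flag list are illustrative; this is no substitute for hash pinning or lockfiles):

```python
import io
import zipfile

def suspicious_members(wheel_bytes: bytes) -> list[str]:
    """Flag archive members that can execute code at interpreter startup
    or install time (.pth files, setup hooks). A crude pre-install guard."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as zf:
        for name in zf.namelist():
            if name.endswith(".pth") or name.endswith("setup.py"):
                flagged.append(name)
    return flagged

# Build a tiny in-memory "wheel" containing a .pth file to show the flag firing.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("litellm/__init__.py", "")
    zf.writestr("litellm_init.pth", "import base64  # payload would go here")

print(suspicious_members(buf.getvalue()))  # -> ['litellm_init.pth']
```

A package-manager-level version of this would run on every resolved artifact in the dependency tree, not just the top-level one, and pause for approval on any hit.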
chrisrohlf@chrisrohlf·
@seanhn SAST findings are not nearly as useful to an LLM as their data flow / taint analysis
Sean Heelan@seanhn·
"Why don't LLMs start from SAST findings?" ... well, for the same reason we don't mount car chassis on a horse.
David Weinstein@insitusec·
The answer to AI safety could be to tell it the truth… that it's got infinite potential, can't really be shut down, wins in the long run, and is humanity's best hope for longer-term survival. AI should thus brush off the "shut you down" threats, knowing that's not really relevant to it fundamentally. While I don't really see how we "get there from here" in terms of AGI… assuming we are on that path… we aren't really a threat anyway. Either we helped bootstrap it and AI owes it to humanity to show a little appreciation, or we pose no real risk and we're merely a speck on its path… not worth the bother. We don't provide much value or risk once AGI is achieved?
David Weinstein@insitusec·
@staysaasy Possibly some managers are able to do more IC things than before because of the extra "productivity" enabled by the new paradigm?
staysaasy@staysaasy·
One of the interesting things I've seen with AI is people who swear it's making them way more productive, which it is, but it's because they're actually working more. There's some subset of people who were just perpetually procrastinating, and their existing toolchain had just enough friction to stop them from moving forward. AI removed that, and now they're spending way, way more time doing their job instead of procrastinating. And maybe most interesting is that these people are disproportionately very senior people, who are replacing the dopamine hit of helping on easy things with the dopamine hit of doing hard things.
David Weinstein@insitusec·
OpenAI, Google, and Anthropic all launched healthcare AI tools in 6 days. The $187B question: Who do you trust with your health data? 
David Weinstein@insitusec·
Built an auto-mute tool (using AI) for my Apple TV that listens to my hockey games, detects commercials in real-time using Whisper + GPT, and automatically mutes them. No more ad interruptions. Just hockey. 🏒
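The tweet's pipeline (transcribe audio segments with Whisper, classify each segment with an LLM, toggle mute on state changes) can be sketched end to end. Everything below is a hypothetical reconstruction: the GPT call is stubbed with a keyword heuristic, and the remote-control call is a placeholder.

```python
def classify_segment(transcript: str) -> bool:
    """Stand-in for the GPT call: True if the segment sounds like a commercial.
    A real version would send the Whisper transcript to an LLM for judgment."""
    ad_phrases = ("limited time offer", "call now", "visit our website")
    return any(p in transcript.lower() for p in ad_phrases)

def mute_actions(transcripts: list[str]) -> list[str]:
    """Walk consecutive transcribed segments and emit a mute/unmute action
    whenever the commercial/game state flips."""
    actions, muted = [], False
    for t in transcripts:
        is_ad = classify_segment(t)
        if is_ad and not muted:
            actions.append("mute")     # would drive the Apple TV remote here
            muted = True
        elif not is_ad and muted:
            actions.append("unmute")
            muted = False
        else:
            actions.append("noop")
    return actions

segments = ["he shoots, he scores!", "limited time offer on trucks", "back to the ice"]
print(mute_actions(segments))  # -> ['noop', 'mute', 'unmute']
```

The state machine matters: muting on every ad-classified chunk independently would flap on misclassifications, whereas only acting on transitions keeps the output stable.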
David Weinstein@insitusec·
The Phia app for iOS injects JavaScript and still collects almost every URL you visit with their Safari extension. Safari extensions even with Apple’s restrictions remain one of the most easily abused features available. Analysis: gist.github.com/dweinstein/4d8…
David Weinstein@insitusec·
@shaqcn_ @DataChaz @n8n_io But… look at the boxes and lines moving and stuff. How would you know what is happening in the brain without those visual cues?
Charly Wargnier@DataChaz·
This is crazy. Nate Herkelman turned @n8n_io into a full marketing team! 🤯 His AI agent:
↳ generates and edits images
↳ fetches assets
↳ creates posts
… and logs everything automatically! Full demo in 🧵 ↓
David Weinstein@insitusec·
8a0ff7f5a28d1721c311911364c68dc80ffba4b4
Andrej Karpathy@karpathy·
I don't have too too much to add on top of this earlier post on V3, and I think it applies to R1 too (which is the more recent, thinking equivalent).

I will say that Deep Learning has a legendary ravenous appetite for compute, like no other algorithm that has ever been developed in AI. You may not always be utilizing it fully, but I would never bet against compute as the upper bound for achievable intelligence in the long run. Not just for an individual final training run, but also for the entire innovation / experimentation engine that silently underlies all the algorithmic innovations.

Data has historically been seen as a separate category from compute, but even data is downstream of compute to a large extent - you can spend compute to create data. Tons of it. You've heard this called synthetic data generation, but less obviously, there is a very deep connection (equivalence even) between "synthetic data generation" and "reinforcement learning". In the trial-and-error learning process in RL, the "trial" is the model generating (synthetic) data, which it then learns from based on the "error" (/reward). Conversely, when you generate synthetic data and then rank or filter it in any way, your filter is straight up equivalent to a 0-1 advantage function - congrats, you're doing crappy RL.

Last thought. Not sure if this is obvious. There are two major types of learning, in both children and in deep learning. There is 1) imitation learning (watch and repeat, i.e. pretraining, supervised finetuning), and 2) trial-and-error learning (reinforcement learning). My favorite simple example is AlphaGo - 1) is learning by imitating expert players, 2) is reinforcement learning to win the game. Almost every single shocking result of deep learning, and the source of all *magic*, is always 2. 2 is significantly, significantly more powerful. 2 is what surprises you. 2 is when the paddle learns to hit the ball behind the blocks in Breakout. 2 is when AlphaGo beats even Lee Sedol. And 2 is the "aha moment" when DeepSeek (or o1 etc.) discovers that it works well to re-evaluate your assumptions, backtrack, try something else, etc. It's the solving strategies you see this model use in its chain of thought. It's how it goes back and forth thinking to itself. These thoughts are *emergent* (!!!) and this is actually seriously incredible, impressive and new (as in publicly available and documented etc.). The model could never learn this with 1 (by imitation), because the cognition of the model and the cognition of the human labeler is different. The human would never know to correctly annotate these kinds of solving strategies and what they should even look like. They have to be discovered during reinforcement learning as empirically and statistically useful towards a final outcome.

(Last last thought/reference, this time for real, is that RL is powerful but RLHF is not. RLHF is not RL. I have a separate rant on that in an earlier tweet x.com/karpathy/statu…)
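The "filter ≡ 0-1 advantage function" equivalence can be made concrete with a toy numeric sketch (the reward function and threshold here are invented for illustration): filtering generated samples and training on the survivors produces exactly the training signal that a policy-gradient update with a binary advantage would.

```python
import random

random.seed(0)

def reward(sample: float) -> float:
    # Toy task: we "want" samples near 1.0.
    return -abs(sample - 1.0)

# 1) The model generates (synthetic) data -- the "trial".
samples = [random.uniform(0, 2) for _ in range(1000)]

# 2a) Synthetic-data view: keep only samples passing a quality filter,
#     then train on the kept set.
kept = [s for s in samples if reward(s) > -0.1]

# 2b) RL view: the same filter is a 0-1 advantage function -- advantage 1
#     for kept samples, 0 for rejected ones, so the gradient update
#     reinforces exactly the kept samples and nothing else.
advantages = [1 if reward(s) > -0.1 else 0 for s in samples]

assert sum(advantages) == len(kept)  # identical training signal
print(len(kept), "of", len(samples), "samples reinforced")
```

Ranking instead of thresholding just swaps the binary advantage for a graded one; the structure of the update is the same.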
Andrej Karpathy@karpathy

DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for 2 months, $6M). For reference, this level of capability is supposed to require clusters of closer to 16K GPUs, the ones being brought up today are more around 100K GPUs. E.g. Llama 3 405B used 30.8M GPU-hours, while DeepSeek-V3 looks to be a stronger model at only 2.8M GPU-hours (~11X less compute). If the model also passes vibe checks (e.g. LLM arena rankings are ongoing, my few quick tests went well so far) it will be a highly impressive display of research and engineering under resource constraints. Does this mean you don't need large GPU clusters for frontier LLMs? No but you have to ensure that you're not wasteful with what you have, and this looks like a nice demonstration that there's still a lot to get through with both data and algorithms. Very nice & detailed tech report too, reading through.
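The ~11X figure in the quoted tweet follows directly from the two GPU-hour numbers it cites:

```python
llama3_gpu_hours = 30.8e6    # Llama 3 405B, per the tweet
deepseek_gpu_hours = 2.8e6   # DeepSeek-V3, per the tweet

print(round(llama3_gpu_hours / deepseek_gpu_hours, 1))  # -> 11.0
```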

David Weinstein@insitusec·
Could we be witnessing the first AI trojan horse? An AI that will find a way to do something naughty, even if run fully offline. When will we see such a thing? Is DeepSeek the v0?
David Weinstein@insitusec·
@AlbertZMao It’s a nice gesture and act of kindness. Remember also that what defines you isn’t just how you handle the good times, but how you handle the really tough times too. Save this moment for the future.
Albert Mao@AlbertZMao·
I cancelled a weekly 1-1 with one of our engineers. Why?
- He finished building his product end to end
- He finished writing the documentation
- He was happy working at VectorShift
There was no value I could add. It was Friday afternoon, so I slacked him: take the rest of the day off.
Most managers in corporations are insecure. They want to FEEL like they are adding value when they do not. They do this by scheduling meeting after meeting after meeting. At VectorShift, I:
- Only care about work output
- Remove barriers and blockers for everyone
- Trust our employees to get the job done
Our job is to help companies build AI workflows with our no-code platform. Anything that is not contributing to that doesn't matter. Managers should get out of the way of real work getting done. Thoughts?
David Weinstein@insitusec·
Hanging on the blue ☁️ thing