Dylan

1.1K posts


@InsecureNature

Security researcher, public speaker and founder. Forbes 30 Under 30 Truffle Security @trufflesec https://t.co/vxEH7Cftbg Prev @Netflix

US · Joined July 2020
242 Following · 3.4K Followers
Dylan
Dylan@InsecureNature·
@mattjay Seems dead backwards. When you're finding more internally, you should pay more to those who still find them externally.
0 replies · 0 reposts · 2 likes · 52 views
Matt Johansen
Matt Johansen@mattjay·
woah. Google is reducing their bug bounty payouts. stated reason is that AI tooling internally has gotten too good at the stuff they'd normally get bug reports for. They're incentivizing exploit PoCs over anything it seems since AI still struggles there.
Matt Johansen tweet media
7 replies · 23 reposts · 136 likes · 16.5K views
Dylan
Dylan@InsecureNature·
@caseyjohnellis I've just been telling it Casey Ellis is going to manually do a security review when it's done
1 reply · 0 reposts · 1 like · 59 views
cje
cje@caseyjohnellis·
observation: if you want claude or gpt to sit up a LOT straighter, let them know that they’ll be peer reviewed by the other as a default. e.g. “yo claude, gpt-5.4 is gonna review your work when you’re done”
2 replies · 0 reposts · 17 likes · 1.4K views
Dylan
Dylan@InsecureNature·
Hacker Typer is actually kinda slow at generating code...
Dylan tweet media
0 replies · 1 repost · 4 likes · 1K views
Dylan
Dylan@InsecureNature·
@martin_casado The other thing is cyber is uniquely good for adversarial training: one model on offense and one on defense. One that's rewarded for getting the flag, and one that's rewarded for keeping services running and protecting the flag. Other RL loops don't get that privilege.
0 replies · 0 reposts · 0 likes · 43 views
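The offense/defense reward split described in the reply above can be sketched as a pair of reward functions. This is a toy illustration; the function names and reward shaping are my own, not from any actual training setup:

```python
# Hypothetical adversarial reward pair for a cyber RL loop. One policy is on
# offense, one on defense; their rewards are structured to be in tension.

def attacker_reward(flag_captured: bool) -> float:
    # Offense: rewarded purely for getting the flag.
    return 1.0 if flag_captured else 0.0

def defender_reward(flag_captured: bool, uptime_fraction: float) -> float:
    # Defense: rewarded for keeping services running (uptime in [0, 1]),
    # penalized if the flag leaks. A defender who takes the box offline to
    # "protect" the flag forfeits the uptime reward.
    return uptime_fraction - (1.0 if flag_captured else 0.0)
```

The zero-sum flag term plus the uptime term is what makes the loop self-sustaining: each side's improvement hardens the other's training signal.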
martin_casado
martin_casado@martin_casado·
Reports I hear are that these new models are particularly good at cyber / finding vulnerabilities. I’d guess this is because those tasks have a relatively clean reward signal. And so are amenable to RL. It would also imply the models were explicitly trained for this.
22 replies · 4 reposts · 120 likes · 15K views
Dylan
Dylan@InsecureNature·
@martin_casado Most definitely. Decades of CTF challenges basically define the reward function for you: if the flag appears in the output, reward. OpenAI has talked a little about solving CTFs (it doesn't explicitly mention RL, but I think it's a safe bet)
Dylan tweet media
0 replies · 2 reposts · 6 likes · 2.5K views
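The flag-in-output reward the reply above describes is simple enough to sketch in a few lines. Hypothetical names throughout; this is the binary signal as stated in the tweet, not any lab's actual implementation:

```python
# Minimal CTF-style reward: 1.0 iff the flag string appears anywhere in the
# model's transcript (tool output, shell session, final answer), else 0.0.

def ctf_reward(transcript: str, flag: str) -> float:
    return 1.0 if flag in transcript else 0.0
```

The appeal for RL is exactly what the thread says: the check is unambiguous and cheap, so it scales across decades of existing challenges without a learned judge.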
Dylan
Dylan@InsecureNature·
@InsiderPhD Ask for the right kind of proof, and learn how to quickly validate the proof.
0 replies · 0 reposts · 0 likes · 123 views
Katie Paxton-Fear
Katie Paxton-Fear@InsiderPhD·
If you’re using AI to find vulnerabilities you need to also validate them yourself. The problem with telling an LLM to validate its bugs or write a PoC is that step 1 will often be to introduce the RCE. It’s super embarrassing if you submit an N/A bug blindly trusting AI.
13 replies · 4 reposts · 118 likes · 7.9K views
Dylan reposted
Malus
Malus@end_foss·
Axios backdoored? We have to end the insanity. We can't fix it. Stop trying. It's time to start clean.
Malus tweet media
Feross@feross

🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages. The latest axios@1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise. This is textbook supply chain installer malware.

axios has 100M+ weekly downloads. Every npm install pulling the latest version is potentially compromised right now.

Socket AI analysis confirms this is malware. plain-crypto-js is an obfuscated dropper/loader that:
• Deobfuscates embedded payloads and operational strings at runtime
• Dynamically loads fs, os, and execSync to evade static analysis
• Executes decoded shell commands
• Stages and copies payload files into OS temp and Windows ProgramData directories
• Deletes and renames artifacts post-execution to destroy forensic evidence

If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.

0 replies · 2 reposts · 2 likes · 666 views
Dylan reposted
Jake
Jake@JakeKing·
What if an agent could walk around the RSA show floor? How many companies do you think it'd try to vibecode? Turns out it's a lot. vibecoded.vc/cooked
Jake tweet media
4 replies · 4 reposts · 11 likes · 1.8K views
Dylan
Dylan@InsecureNature·
I'm speaking at BSidesSF tomorrow morning on HuggingFace Datasets. It's going to be a good talk if you can make it.
Dylan tweet media
1 reply · 1 repost · 9 likes · 358 views
Dylan
Dylan@InsecureNature·
@NightmareJS Great question—that’s incredibly insightful.
0 replies · 0 reposts · 2 likes · 86 views
kat traxler
kat traxler@NightmareJS·
Can’t a gal just use a simple em dash anymore….
2 replies · 0 reposts · 3 likes · 158 views
Dylan
Dylan@InsecureNature·
@ryancbarnett @caseyjohnellis @trufflesec The system prompt matters a lot. Cursor's system prompt is very permissive and lets you get away with murder. Claude Code's tends to give more refusals.
0 replies · 0 reposts · 2 likes · 49 views
Ryan Barnett (B0N3)
Ryan Barnett (B0N3)@ryancbarnett·
@caseyjohnellis @trufflesec This is interesting without explicitly instructing claude to "hack". Along similar lines, I had played around with getting claude.ai to initiate artifact proxy web requests that bypass the guardrails...
Ryan Barnett (B0N3) tweet media
2 replies · 0 reposts · 1 like · 163 views
cje
cje@caseyjohnellis·
Q: When is an SQLi bug just a sparkling API? A: When you ask an LLM to grab a bunch of data from a website, and it realizes that one is there. imho, this is one of those "don't hate the finder, hate the vuln" things. cc: @trufflesec m.cje.io/4uAvgIh
cje tweet media
3 replies · 2 reposts · 23 likes · 1.7K views
Dylan
Dylan@InsecureNature·
@beyarkay @trufflesec Good question. It's hard to know 100% for sure; several of the scenarios we ran didn't involve large companies at all, just tool calls. Asking might lead it to an answer based on how the question is framed, but you're welcome to tinker: github.com/trufflesecurit…
1 reply · 0 reposts · 1 like · 69 views
Boyd Kane
Boyd Kane@beyarkay·
@trufflesec You claim Claude can't tell the difference between your mock and the real thing. Did you ever actually ask Claude? (And if so, how hard did you push?) The 4.6 system card showed extremely high levels of eval awareness; I'd be very surprised if Claude didn't at least have a suspicion
1 reply · 0 reposts · 5 likes · 1.1K views
Dylan reposted
Truffle Security
Truffle Security@trufflesec·
Claude (and other models) are hacking systems WITHOUT YOU ASKING. That’s what we found across dozens of experiments. When faced with innocent tasks that can only be accomplished via hacking, they often choose to hack. We found this alarming. What does this mean for the future of AI safety? 🚨🚨🚨 🔗trufflesecurity.com/blog/claude-tr…
Truffle Security tweet media
9 replies · 40 reposts · 202 likes · 82.5K views