Dylan

1.1K posts

Dylan

@InsecureNature

Security researcher, public speaker and founder. Forbes 30 Under 30 Truffle Security @trufflesec https://t.co/vxEH7Cftbg Prev @Netflix

US Katılım Temmuz 2020

242 Takip Edilen3.4K Takipçiler

Dylan@InsecureNature·2d

@mattjay Seems dead backwards. When you're finding more internally you should pay more for those that still find them externally.

English

Matt Johansen@mattjay·4d

woah. Google is reducing their bug bounty payouts. stated reason is that AI tooling internally has gotten too good at the stuff they'd normally get bug reports for. They're incentivizing exploit PoCs over anything it seems since AI still struggles there.

English

136

16.5K

Dylan@InsecureNature·4d

@caseyjohnellis I've just been telling it Casey Ellis is going to manually do a security review when it's done

English

cje@caseyjohnellis·6d

observation: if you want claude or gpt to sit up a LOT straighter, let them know that they’ll be peer reviewed by the other as a default. e.g. “yo claude, gpt-5.4 is gonna review your work when you’re done”

English

1.4K

Dylan@InsecureNature·19 Nis

Hacker Typer is actually kinda slow at generating code...

English

Dylan@InsecureNature·10 Nis

@martin_casado The other thing is cyber is uniquely good for adversarial training. One model on offense and one on defense. One that's rewarded for getting the flag, and one that's rewarded to keep services running and protect the flag. Other RL loops don't get that privilege.

English

martin_casado@martin_casado·8 Nis

Reports I hear are that these new models are particularly good at cyber / finding vulnerabilities. I’d guess this is because those tasks have a relatively clean reward signal. And so are amenable to RL. It would also imply the models were explicitly trained for this.

English

120

15K

Dylan@InsecureNature·9 Nis

@martin_casado Most definitely. Decades of CTF challenges basically define the reward function for you; if the flag appears in output, reward. OpenAI has talked a little bit about solving CTF's (though it's not explicitly talking about RL I think it's a safe bet to make)

English

2.5K

Dylan@InsecureNature·6 Nis

@InsiderPhD Ask for the right kind of proof, and learn how to quickly validate the proof.

English

123

Katie Paxton-Fear@InsiderPhD·5 Nis

If you’re using AI to find vulnerabilities you need to also validate them yourself. The problem with telling a LLM to validate their bugs or write a poc is that step 1 will often be introduce the RCE. It’s super embarrassing if you submit a NA bug blindly trusting AI.

English

118

7.9K

Dylan retweetledi

Malus@end_foss·4 Nis

Axios backdoored? We have to end the insanity. We can't fix it. Stop trying. It's time to start clean.

Feross@feross

🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages. The latest axios @1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise. This is textbook supply chain installer malware. axios has 100M+ weekly downloads. Every npm install pulling the latest version is potentially compromised right now. Socket AI analysis confirms this is malware. plain-crypto-js is an obfuscated dropper/loader that: • Deobfuscates embedded payloads and operational strings at runtime • Dynamically loads fs, os, and execSync to evade static analysis • Executes decoded shell commands • Stages and copies payload files into OS temp and Windows ProgramData directories • Deletes and renames artifacts post-execution to destroy forensic evidence If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.

English

666

Dylan@InsecureNature·27 Mar

What's funniest about this is the actual Cursor prompts

Jake@JakeKing

What if an agent could walk around the rsa show floor. How many companies do you think it'd try to vibecode? turns out its a lot. vibecoded.vc/cooked

English

808

Dylan@InsecureNature·26 Mar

Vibe code all the companies at RSA

Jake@JakeKing

What if an agent could walk around the rsa show floor. How many companies do you think it'd try to vibecode? turns out its a lot. vibecoded.vc/cooked

English

694

Dylan retweetledi

Jake@JakeKing·26 Mar

What if an agent could walk around the rsa show floor. How many companies do you think it'd try to vibecode? turns out its a lot. vibecoded.vc/cooked

English

1.8K

Dylan@InsecureNature·22 Mar

"This just in, false alarm, it was just saint patrick's day"

cje@caseyjohnellis

attribution is gonna be a PAIN for this one

English

1.5K

Dylan@InsecureNature·21 Mar

I'm speaking at BSidesSF tomorrow morning on HuggingFace Datasets. It's going to be a good talk if you can make it.

English

358

Dylan@InsecureNature·20 Mar

@NightmareJS Great question—that’s incredibly insightful.

English

kat traxler@NightmareJS·20 Mar

Can’t a gal just use a simple em dash anymore….

English

158

Dylan@InsecureNature·18 Mar

@ryancbarnett @caseyjohnellis @trufflesec The system prompt matters a lot. Cursor's system prompt is very permissive, and lets you get away with murder. Claude code's tends to give more refusals.

English

Ryan Barnett (B0N3)@ryancbarnett·18 Mar

@caseyjohnellis @trufflesec This is interesting without explicitly instructing claude to "hack". Along similar lines, I had played around with getting claude.ai to initiate artifact proxy web requests that bypass the guardrails...

English

163

cje@caseyjohnellis·18 Mar

Q: When is an SQLi bug just a sparkling API? A: When you ask an LLM to grab a bunch of data from a website, and it realizes that one is there. imho, this is one of those "don't hate the finder, hate the vuln" things. cc: @trufflesec m.cje.io/4uAvgIh

English

1.7K

Dylan retweetledi

Corridor@corridor·14 Mar

Are LLMs exposing critical credentials? Listen to Truffle Security CEO Dylan Ayrey (@InsecureNature) in conversation with @jackhcable and @alexstamos for our fifth episode of End to End: youtu.be/x7b5w7RQHhw

YouTube

English

Dylan@InsecureNature·11 Mar

@beyarkay @trufflesec Good question. It's hard to 100% know for sure, several of the scenarios we ran did not involve large companies, they just involved tool calls. Asking may lead to leading it to an based on the question, but you're welcome to tinker: github.com/trufflesecurit…

English

Boyd Kane@beyarkay·11 Mar

@trufflesec You claim Claude can't tell the difference between your mock and the real thing. Did you ever actually ask Claude? (And if so, how hard did you push?) The 4.6 system card showed extremely high levels of eval awareness, I'd be very surprised if Claude didn't even have a suspicion

English

1.1K

Dylan retweetledi

Truffle Security@trufflesec·10 Mar

Claude (and other models) are hacking systems WITHOUT YOU ASKING. That’s what we found across dozens of experiments. When faced with innocent tasks that can only be accomplished via hacking, they often choose to hack. We found this alarming. What does this mean for the future of AI safety? 🚨🚨🚨 🔗trufflesecurity.com/blog/claude-tr…