Caleb Gross

1.4K posts


@noperator

ai for security

Joined October 2009
625 Following · 2.6K Followers
Pinned tweet
Caleb Gross @noperator
1/ Agentic LLMs can automate vuln detection. Very exciting, but doesn't address the hardest part (imo) of vuln research: prioritization. Can we reliably explore the search space and separate signal from noise? I wrote a paper (and OSS tool) to solve this. arxiv.org/pdf/2512.06155
[image]
2 replies · 58 reposts · 215 likes · 100.9K views
eyitemi @eeyitemi
Olá 👋 @halvarflake @thegrugq I have a question that I fear may be badly posed, but I think you both are the right people to ask.

I’ve been doing a bit of pressure-testing on whether ideas borrowed from measure theory can actually sharpen vulnerability research methodology, or whether they mostly give elegant language to something that is still fundamentally craft, intuition, and situational judgment.

What keeps pulling me toward the analogy is that a lot of serious bugs I’ve found and reported recently seem to survive in interestingly low-coverage, high-consequence, but still reachable regions of behavior, especially where a lot of assumptions are relied on but never actually enforced. So concepts like observability, rarity, and shifts in sampling pressure somehow feel pretty relevant.

But then, the more I try to operationalize it, the more I worry the formal vocabulary creates fake precision. So I’m curious: where do you think the analogy genuinely produces sustainable, scalable vuln research leverage, and at what point does it collapse into intellectual decoration?
2 replies · 0 reposts · 2 likes · 451 views
solst/ICE of Astarte
‼️🚨 BREAKING: It has come to my attention that some of you are not following @noperator. He has a five-digit IQ and is working on a bunch of cool projects like SiftRank and Cagent. Please follow asap
ɐʞsǝs @akses_0x00

@IceSolst @noperator yes! love this and thanks for the SiftRank tip... how was I not following @noperator until now... fixed

13 replies · 5 reposts · 78 likes · 8.6K views
Zack Korman @ZackKorman
Every AI agent sandbox project
[image]
9 replies · 16 reposts · 131 likes · 7K views
Daniel Sempere Pico @dansemperepico
You guys all run Claude Code with claude --dangerously-skip-permissions right? Because otherwise how in the world can you sit there accepting every single permission when building something?
474 replies · 22 reposts · 2.2K likes · 285.5K views
Caleb Gross @noperator
@IceSolst @stokfredrik I have a branch (not pushed to github yet) that does full mitm proxy of all network egress. can specify allowlist of hosts, ports, protocols, http methods/paths, etc.
1 reply · 0 reposts · 4 likes · 106 views
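The allowlist idea generalizes beyond any one proxy. As a hedged sketch (the rules, field names, and hosts below are illustrative assumptions, not taken from Caleb's unpublished branch), an egress filter of this kind boils down to matching each outbound request against host, port, method, and path-prefix rules:

```python
# Minimal egress-allowlist check of the kind a MITM proxy could apply
# to each outbound request. All rules here are hypothetical examples.
from urllib.parse import urlparse

ALLOW = [
    # (host suffix, port, allowed methods, required path prefix)
    ("api.anthropic.com", 443, {"POST"}, "/v1/"),
    ("github.com", 443, {"GET"}, "/"),
]

def allowed(method: str, url: str) -> bool:
    """Return True only if the request matches an allowlist entry."""
    u = urlparse(url)
    host = u.hostname or ""
    port = u.port or (443 if u.scheme == "https" else 80)
    for suffix, rule_port, methods, prefix in ALLOW:
        # Match the exact host or any subdomain of it.
        if (host == suffix or host.endswith("." + suffix)) \
                and port == rule_port \
                and method in methods \
                and u.path.startswith(prefix):
            return True
    return False
```

A proxy built this way fails closed: anything not explicitly matched (unknown host, wrong port, disallowed method) is rejected.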
STÖK ✌️ @stokfredrik
What is the most efficient and easy way to set up a solution today for Claude Code segmentation/sandboxing, without losing too much performance? What I want:
- a secure way to run Claude Code + tools with full access to a shell on a laptop (independent of the OS). I want it to be able to install apps, dependencies, you name it, on the fly inside its "home".
- egress over network, so it can send/route traffic through a proxy like Burp/Caido for logging purposes, passive audits, and manual evaluations. But no other host access; findings will be sent back into the workflow for validation.
- files/memory/context dumps synced over git, rsync, or similar.
- an easy snapshot functionality so I'm able to roll back and get them back up and running fast when it eats itself.
Any ideas? I could easily ask the LLM, but I want some human input around it.
25 replies · 11 reposts · 112 likes · 15.7K views
Caleb Gross @noperator
@mx_schmitt Thanks for all of your work on Playwright (especially for Go). Congrats on the new role!
0 replies · 0 reposts · 1 like · 21 views
Max Schmitt @mx_schmitt
If you’re in the Bay Area and working on browser use, agents, or AI automation, happy to connect.
1 reply · 0 reposts · 4 likes · 232 views
Max Schmitt @mx_schmitt
Excited to share that after 5 awesome years working on Playwright, I’ve moved from Berlin to San Francisco to join Amazon AGI Lab, working on browser use and AI Agents. I’m extremely grateful to the Playwright team and community for all the support over the last few years!
7 replies · 3 reposts · 83 likes · 4.7K views
Caleb Gross @noperator
Great article. Two questions after reading:
- There are certain skills needed for a researcher to succeed at low (vs. high) points in the abstraction stack. How much do they overlap?
- How do we reason about the economics of VR if we don't feel the true unsubsidized cost of AI?
[image]
chrisrohlf @chrisrohlf

Shrinking Margins: Frontier models don't perform vulnerability discovery the way traditional tools do; they reason through code the way humans do, and the margin left for human researchers is rapidly shrinking. secure.dev/shrinking_marg…

0 replies · 1 repost · 2 likes · 1.3K views
Caleb Gross reposted
Richard Johnson @richinseattle
Spread the word! @phrack CFP with demoscene cracktro is live. Turn up the volume and enjoy the awesome stylings of @PiotrBania with some hopefully inspiring text from phrack staff :) phrack.org
[image]
6 replies · 133 reposts · 249 likes · 37.5K views
@levelsio @levelsio
The 3-2-1 Backup Rule is more important than ever if you code with AI, because fatal accidents can happen. It means you should have 3 copies of your data, in 2 different media types, and 1 copy off-site:
1) One is the actual data on your own server (the hard drive) or DB server
2) One backup is in cloud storage (that's the different media type)
3) One backup is off-site, at another provider, and preferably in another geographical location
For me that's 1) Hetzner VPS, 2) Hetzner's own daily and weekly backups on the dashboard, and 3) Backblaze B2. Hetzner's own backups are impossible to access by the VPS or AI, so that's safer. If you use AWS or other providers you can apply the 3-2-1 Backup Rule in your own way. I've never lost any data!
Alexey Grigorev @Al_Grigor

Claude Code wiped our production database with a Terraform command. It took down the DataTalksClub course platform and 2.5 years of submissions: homework, projects, and leaderboards. Automated snapshots were gone too. In the newsletter, I wrote the full timeline + what I changed so this doesn't happen again. If you use Terraform (or let agents touch infra), this is a good story for you to read. alexeyondata.substack.com/p/how-i-droppe…

125 replies · 151 reposts · 2.2K likes · 428.4K views
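The 3-2-1 flow is mechanical enough to sketch. In this hedged example, local directories stand in for the real targets (in practice copy 2 would be a second media type like cloud storage and copy 3 an off-site provider such as Backblaze B2); all paths and names below are placeholders:

```python
# 3-2-1 sketch: one primary data set plus two independent backup copies.
# Local directories stand in for cloud storage (copy 2) and an off-site
# provider (copy 3) so the flow is easy to demonstrate end to end.
import shutil
from pathlib import Path

def backup_321(primary: Path, second_media: Path, offsite: Path) -> None:
    """Mirror the primary data (copy 1) into two backup locations."""
    for dest in (second_media, offsite):
        shutil.copytree(primary, dest, dirs_exist_ok=True)

# Demo: create a primary copy, then produce backups 2 and 3.
primary = Path("data")
primary.mkdir(exist_ok=True)
(primary / "notes.txt").write_text("don't lose me")
backup_321(primary, Path("bk-cloud"), Path("bk-offsite"))
```

The point of the rule is independence of failure domains: a copy the agent (or the VPS it controls) cannot reach survives exactly the Terraform-style accident described above.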
Caleb Gross reposted
Aaron Grattafiori @dyn___
Yeah, some of us (and Caleb) have been saying this for a bit now. The finding is crazy now; the triage and exploiting are the next hurdle, but it will also fall (as @seanhn has been pointing out), or it just requires specific agents...(Cont)..
Caleb Gross @noperator

anthropic.com/news/mozilla-f…

1 reply · 4 reposts · 28 likes · 4.9K views
Caleb Gross @noperator
The vuln discovery is amazing and I don't mean to downplay that. But worth noting the (current) gap between discovery and exploitation.
2 replies · 0 reposts · 4 likes · 570 views
dawgyg - WoH @thedawgyg
Just remember... When your AI agent accidentally deletes a production database on your target while you're letting it do the hacking for you, you're the one that will face charges, not the bot.
13 replies · 27 reposts · 267 likes · 15.6K views
Josh Avraham @josh_avraham
Thinking about switching from Alacritty to Ghostty
1 reply · 0 reposts · 0 likes · 352 views
Caleb Gross reposted
Simone Margaritelli @evilsocket
Just managed to run distributed inference clustering an NVIDIA GPU, a MacBook Pro, and an iPhone 16 🔥 Metal acceleration on the mobile node working like a charm. Cake (in Rust) is now the only project that allows you to distribute your local inference on mobile, Mac, and Linux.
[image]
14 replies · 23 reposts · 192 likes · 14.8K views