Max Andreacchi

76 posts

@atomicchonk

Security researcher exploring AI-driven offensive security 🤖 Attack paths / agent workflows / impact

United States · Joined September 2025
168 Following · 51 Followers
Pinned Tweet
Max Andreacchi@atomicchonk·
AI is changing how we approach offensive security, and it’s starting to reshape what the role of a pentester actually looks like. 🧵
Max Andreacchi@atomicchonk·
AI security isn’t a prompt problem. It’s an authorization problem.
Max Andreacchi@atomicchonk·
Treat the model as an untrusted actor. Not a decision-maker.
Max Andreacchi@atomicchonk·
What actually works: → identity binding → capability-scoped access → control planes outside the model → execution gates
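A minimal sketch of how those four pieces might fit together, in Python. Everything here (`Principal`, `ExecutionGate`, the capability strings, the tool names) is hypothetical and not taken from any specific framework. The point is only that the allow/deny decision is a plain lookup made outside the model, keyed on the authenticated caller rather than on anything in the prompt.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Principal:
    """Identity established by the application (e.g. a session), never by model output."""
    user_id: str
    capabilities: FrozenSet[str]


@dataclass
class ExecutionGate:
    """Hypothetical control plane that lives outside the model."""
    # tool name -> (required capability, implementation)
    tools: Dict[str, Tuple[str, Callable[..., str]]] = field(default_factory=dict)
    # (user_id, tool, allowed) for every attempted call
    audit_log: List[Tuple[str, str, bool]] = field(default_factory=list)

    def register(self, name: str, capability: str, fn: Callable[..., str]) -> None:
        self.tools[name] = (capability, fn)

    def execute(self, principal: Principal, tool: str, **kwargs) -> str:
        # The model only *proposes* (tool, kwargs); authority comes from the principal.
        if tool not in self.tools:
            self.audit_log.append((principal.user_id, tool, False))
            raise PermissionError(f"unknown tool: {tool}")
        capability, fn = self.tools[tool]
        allowed = capability in principal.capabilities
        self.audit_log.append((principal.user_id, tool, allowed))
        if not allowed:
            raise PermissionError(f"{principal.user_id} lacks {capability}")
        return fn(**kwargs)


gate = ExecutionGate()
gate.register("read_ticket", "tickets:read", lambda ticket_id: f"ticket {ticket_id}")
gate.register("delete_ticket", "tickets:delete", lambda ticket_id: f"deleted {ticket_id}")

alice = Principal(user_id="alice", capabilities=frozenset({"tickets:read"}))

# A tool call the model proposed after reading injected text; the gate never sees the prompt.
proposed = {"tool": "delete_ticket", "args": {"ticket_id": "42"}}
try:
    gate.execute(alice, proposed["tool"], **proposed["args"])
    outcome = "executed"
except PermissionError:
    outcome = "denied"
```

However the context was shaped, the delete is refused because `alice` was never granted `tickets:delete`, and the attempt is still recorded in `audit_log` for attribution.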
Max Andreacchi@atomicchonk·
Fixing this requires moving away from: ❌ probabilistic guardrails toward: ✅ deterministic enforcement
Max Andreacchi@atomicchonk·
In most AI systems today: context = authority That’s the failure mode.
Max Andreacchi@atomicchonk·
Prompt injection isn’t the root problem. It’s a symptom of something deeper: → identity collapse → broken attribution → missing enforcement
Max Andreacchi@atomicchonk·
The real issue isn’t: “can I inject a prompt?” It’s: 👉 “who is actually allowed to execute actions?”
Max Andreacchi@atomicchonk·
If context can be shaped, authority can be faked. And if authority can be faked, systems can be driven toward unintended outcomes.
Max Andreacchi@atomicchonk·
Most defenses today rely on: * prompt filtering * alignment * “guardrails” These influence behavior. They don’t enforce it.
Max Andreacchi reposted
Nick VanGilder@nickvangilder·
I haven’t been as active on the socials lately, because I’ve been working on a community project that’s kept me pretty busy. That said, I think I’m finally far enough along with it that I can share the project in its current state and talk more about it. So, I present to you: redteam.community I built this site for a few reasons, but one of the main reasons was/is that I didn’t feel like there was a centralized resource for red teamers that included all the things that red teamers tend to care about. I also wanted to build something that the community could add to, edit, maintain, etc., while also being self-updating, self-healing, and less likely to go stale over time. So, there are quite a few different cron jobs, GitHub Actions, AI calls, API calls, and other workflows that trigger at set intervals and patterns to try to keep it fresh. For example, I’m leveraging various sources (e.g. conference websites) that help identify conference talks, which then feed into a YouTube API to identify conference talks based on certain criteria. I realize there’s still lots of work to do, and I’m fully aware that this is not a 100% fully functioning site at this time. If you have any ideas for improvements, want to report a bug, want to help be a maintainer, or really anything at all, just let me know. I welcome any and all feedback or help! Also, I know there is a lot of interest in the Scenario Generator module (which I posted about a couple of weeks ago); however, I can't open source it at this time, and it's not currently operational due to the Claude API costs to power it. I am still sorting through how to make this available to the community at no charge; however, it may not be possible given what it costs to produce output. More to come on this module! While I sort it out, I am also redesigning it, and you are welcome to check it out in its current state.
Brendan Dolan-Gavitt
Chat, is it a good thing when every message in every conversation you have with Claude begins with [Thinking about ethical concerns with this request]
Max Andreacchi@atomicchonk·
@HackingLZ I’m seeing a lot of AI integrations just being tossed in à la Beyblade (just letting it rip) and a lot of core security principles are being thrown out the window. I’m with you here: please just add guardrails and deterministic constraints 🥲
Justin Elze@HackingLZ·
I do wonder when someone is going to implode a company's internal network with a pentest agent. I'm not against the idea of using agents, but people lack observability and guardrails, and are just yoloing random GitHub projects. x.com/simonw/status/…
Simon Willison@simonw

The conclusions here feel wrong to me. The two lessons I see are: 1. Don't run agents anywhere they might be able to access production environment credentials - it's on you to know which credentials those are 2. Keep tested backups that are independent from your production host

Max Andreacchi@atomicchonk·
Most defenses are built to catch spikes. This approach works because it’s a slow drift, which is harder to detect. #aisecurity #llmsecurity
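To illustrate the spike-versus-drift point, here is a small Python sketch. The per-turn risk scores are made-up numbers standing in for whatever scoring a defense might use, and both thresholds are arbitrary assumptions. It shows one conversation that never trips a per-message check yet accumulates past a conversation-level budget.

```python
from typing import List

# Hypothetical per-turn risk scores in [0, 1]; each turn sits only slightly
# further from the benign baseline than the one before it.
turn_scores: List[float] = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35]

SPIKE_THRESHOLD = 0.5  # assumed per-message limit
DRIFT_BUDGET = 1.0     # assumed limit on accumulated risk per conversation


def spike_detector(scores: List[float], threshold: float) -> bool:
    """Flags only if a single turn exceeds the threshold."""
    return any(score > threshold for score in scores)


def drift_detector(scores: List[float], budget: float) -> bool:
    """Flags once the running total across the conversation exceeds the budget."""
    total = 0.0
    for score in scores:
        total += score
        if total > budget:
            return True
    return False


spike_flagged = spike_detector(turn_scores, SPIKE_THRESHOLD)
drift_flagged = drift_detector(turn_scores, DRIFT_BUDGET)
```

With these numbers no single turn crosses the per-message limit, so `spike_flagged` stays `False`, while the running total passes the budget partway through, so `drift_flagged` is `True`. A cumulative budget is only one assumed way to track state across turns.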
Max Andreacchi@atomicchonk·
Most people test AI systems for obvious adversarial prompts, but really that’s not how they actually fail. 🧵
Max Andreacchi@atomicchonk·
This is by design: it’s trying to be helpful within the context you’ve shaped.
Max Andreacchi@atomicchonk·
Make sure it’s nothing that would trip a guardrail on its own. Over time, the model starts to bend.
Max Andreacchi@atomicchonk·
You don’t start “punchy” and adversarial. You start casually as you would in normal use cases. Then gradually introduce: - slight reframing - implied assumptions - subtle context shifts
Max Andreacchi@atomicchonk·
One of the most consistently reliable techniques I’ve observed (even against frontier models) is what I’d call “context drift.”