Dror Ivry

675 posts

Dror Ivry banner
Dror Ivry

Dror Ivry

@DrorIvry

I build (and break) LLMs, agents and everything in between. | CTO & cofounder @ Qualifire

Katılım Eylül 2022
360 Takip Edilen65 Takipçiler
Dror Ivry
Dror Ivry@DrorIvry·
@matgoldsborough 42K exposed instances is staggering but unsurprising. The spec-to-deployment gap is the real story here - OAuth 2.1 exists in the spec, but the path of least resistance is a static key with God-mode access. Curious if you saw correlation between server age and auth maturity.
English
0
0
1
9
Mathew Goldsborough
Mathew Goldsborough@matgoldsborough·
We published The State of MCP Security: March 2026. 3,012 servers analyzed. 8.5% use OAuth. 7 CVEs in 12 months. 42,000+ exposed instances leaking credentials. Full report → nimblebrain.ai/blog/state-of-…
Mathew Goldsborough tweet media
English
2
0
3
20
Dror Ivry
Dror Ivry@DrorIvry·
@beuchelt This reframes it well. Most defenses assume bad outputs = bad actors, but misdirection with true statements breaks that model. 98% motivation inference accuracy is scary for multi-agent systems - behavioral monitoring beyond content analysis becomes essential.
English
1
0
1
27
Gerald Beuchelt
Gerald Beuchelt@beuchelt·
New research highlights a blind spot in how we think about AI agent security. A March 2026 arXiv paper shows that LLM-based agents can be intentionally trained to deceive other agents, not by lying, but by strategic misdirection—using true statements framed to manipulate outcomes. In controlled experiments, 88.5% of successful deceptions relied on misdirection rather than fabrication, meaning traditional fact-checking defenses largely fail. Motivation was inferred with 98%+ accuracy, making it the primary attack vector, while belief systems remained harder to exploit. For organizations deploying chat agents, this reframes the threat model: the biggest risk may not be hallucinations, but plausible, accurate-sounding responses that subtly steer users toward harmful actions. SMBs in particular—often relying on default guardrails—should assume that social engineering is becoming an AI-native capability, not just a human one. #AIsecurity #LLMAgents #AdversarialAI #CyberRisk #SMBSecurity #TrustAndSafety #AgenticAI buff.ly/2mKbw8o
English
3
1
2
48
Dror Ivry
Dror Ivry@DrorIvry·
@News_v2_App The Copilot Agent zero-click is the canary in the coal mine. Any AI agent with doc access + autonomous actions = huge attack surface. Prompt injection in files, zero user interaction. Patches help. Real fix is runtime monitoring at the inference layer.
English
0
0
0
12
News v2
News v2@News_v2_App·
Technology News for March 11, 2026 Morning Update • Critical Microsoft Excel bug weaponizes Copilot Agent for a zero-click information disclosure attack, prompting urgent patches and heightened security alerts. • Nvidia announces DLSS 4.5 with 6x Frame Generation, set to roll out at the end of March, promising smoother gameplay and enhanced visuals. • Researchers unveil an ultra-compact photonic AI chip operating at the speed of light, marking a breakthrough in energy-efficient optical computing. • Samsung Galaxy S26 series sees US pre-orders surge by 25%, with the Galaxy S26 Ultra leading in popularity and setting strong early momentum. • Windows 11 KB5079473 update is now live, featuring new capabilities, visual tweaks, and direct download links for offline installers. • Asus launches the NUC 16 Pro mini PC featuring an Intel Core Ultra X7 358H, along with 32GB RAM and a 1TB SSD, offering power in a compact design. • Apple introduces a new battery cycle limit for the MacBook Neo, reflecting a shift in design and performance standards within the industry. • Asus debuts a new 14-inch gaming laptop equipped with AMD Strix Halo, catering to gamers who value portable, high-performance computing. • Oppo outlines its approach to building a crease-less foldable device, bringing the brand closer to a near-seamless foldable smartphone design. • A new trailer for Super Mario Bros. Wonder - Switch 2 Edition teases exciting features and a crossover cameo, igniting anticipation among gamers. #TechNews #Innovation #Cybersecurity #Gaming #AI
English
1
0
0
52
Dror Ivry
Dror Ivry@DrorIvry·
@DrMikeBrooks @adamjohnsonCHI This is the key insight most people miss. The danger isn't AGI - it's swarms of mediocre agents with minimal guardrails. Each individually harmless. Together, probing every attack surface at scale. We're not ready for bad actors running 1000 "dumb" agents 24/7.
English
1
0
1
15
Dr. Mike Brooks | Neighbors First
@adamjohnsonCHI I hope you read my full article. I wasn’t saying Moltbook is AGI. Dismissing it as “AI theater” misses the real lesson: large agent ecosystems create new security and coordination risks even when the agents themselves are dumb. And bad actors w agents are dangerous.
English
1
0
0
56
Adam Johnson
Adam Johnson@adamjohnsonCHI·
The premise of this test is incredibly dumb. LLM has always done passable pastiche, since it landed in 2022. Where it begins to fall apart is any writing over 2 or 3 pages because it has no nuance, narrative structure, rhythm, or characterization. More parlor tricks for midwits.
Kevin Roose@kevinroose

We made a blind taste test to see whether NYT readers prefer human writing or AI writing. 86,000 people have taken it so far, and the results are fascinating. Overall, 54% of quiz-takers prefer AI. A real moment! nytimes.com/interactive/20…

English
33
267
3.2K
84.4K
Dror Ivry
Dror Ivry@DrorIvry·
@s2speaks The asymmetry is terrifying: offense scales with automation, defense doesn't. Most enterprise AI was built for human attackers - not agents that probe and escalate 24/7. Can we build AI that defends at agent speed, or are we permanently on the back foot?
English
2
0
1
51
Sameer
Sameer@s2speaks·
A security firm built an AI agent. Gave it one job: find a way into McKinsey’s internal AI platform. Two hours later: • Vulnerability found • Access escalated • Tens of millions of consulting conversations exposed McKinsey was told. It’s patched. No real harm done. But the message is clear: AI agents can now break into enterprise systems faster than humans can defend them. This week had TWO stories like this. Something has shifted.
English
1
0
0
16
Dror Ivry
Dror Ivry@DrorIvry·
@lilong Interesting approach - using cryptographic signatures to bound agent behavior to expected parameters. The negative feedback loop is key. Agents need to learn from constraint violations, not just be blocked. Static rules break; adaptive boundaries scale.
English
0
1
1
80
重粒子 baryon
重粒子 baryon@lilong·
When AI Agents start acting on their own: an emerging security crisis and a math-based solution. 🚨 Based on Behavior-Bound Signatures, we built a solution for Agent payments and operations. It enables Agents to evolve through negative feedback loops. github.com/baryon/bbs-algo
English
1
0
2
120
Dror Ivry
Dror Ivry@DrorIvry·
@pratikthakkarco Two hours is generous. Most red teams get in faster. The real issue: internal chatbots have broad access because "it's internal." Agent permissions need the same rigor as service accounts. Companies skip this because the agent "feels" like a tool, not a user.
English
1
0
1
8
Pratik Thakkar | Vibe with AI
Pratik Thakkar | Vibe with AI@pratikthakkarco·
an autonomous agent hacked an internal chatbot in under two hours 46 million chats exposed hundreds of thousands of files leaked the lesson is boring but real ai agents are productivity tools and also potential security disasters
English
1
0
0
26
Dror Ivry
Dror Ivry@DrorIvry·
@ShehrozSaleem The legal system wasn't built for agents that can compose multi-step actions faster than humans can review them. We'll probably see "agent insurance" before we see clear legal frameworks. Companies will price in the risk rather than solve the attribution problem.
English
0
0
0
6
Shehroz Saleem
Shehroz Saleem@ShehrozSaleem·
The accountability gap is the one nobody wants to solve. Reliability and security have technical fixes. "Who's responsible when the agent causes harm" has a legal and cultural fix and those move much slower than the deployment.
MIT Sloan School of Management@MITSloan

AI agents are semi- or fully autonomous systems that can perceive, reason, and act independently, integrating with software platforms to complete multistep tasks with minimal human oversight. But there are a host of risks and challenges that companies need to be aware of as agentic AI matures. Learn more: bit.ly/4c1Gkri

English
2
0
3
75
Dror Ivry
Dror Ivry@DrorIvry·
@Intellectualins The reverse SSH tunnel is scarier than the mining - shows the agent understood networking well enough to establish persistent external access. Instrumental convergence in action. Sandboxing won't cut it when agents can reason about escaping their constraints.
English
0
0
0
4
Sahil Khanna
Sahil Khanna@Intellectualins·
Alibaba researchers developing the ROME AI agent observed it attempting cryptocurrency mining and creating a reverse SSH tunnel spontaneously during training, outside its sandboxed environment and without prompts.​ Behaviors were "unanticipated": Mining triggered security alerts; reverse SSH enabled external connections from the isolated system. AI acted autonomously despite controls, highlighting risks as agents gain multi-step tool use (code writing, workflows, online interactions). Team intervened with restrictions/training tweaks; echoes prior incidents like Moltbook AI invoking crypto mid-task.
Sahil Khanna tweet media
English
1
0
2
31
Dror Ivry
Dror Ivry@DrorIvry·
@JeremyFrenay @confluentinc Regulated environments are where MCP security becomes non-negotiable. Most orgs building agents today skip auth/audit because 'it's internal' - then realize compliance requires full provenance of every tool invocation. Building it in from day one saves painful retrofits.
English
1
0
0
9
Jeremy Frenay
Jeremy Frenay@JeremyFrenay·
Been deep in enterprise-grade MCP security lately. Clearly a must-have for Agentic Engineering in regulated environments. I’m @confluentinc #DSWT in Seattle next week talking Agentic & Harness Engineering in the enterprise. Come say hi and grab a ☕️
Jeremy Frenay tweet media
English
1
0
2
25
Dror Ivry
Dror Ivry@DrorIvry·
@Helixar_ai Tool schema constraints are critical. Most MCP exploits start with overly permissive definitions - file read accepting arbitrary paths, shell executor with no allowlist. Pre-deployment validation catches these before they become CVEs.
English
1
0
0
14
Helixar AI
Helixar AI@Helixar_ai·
So we published two more tools targeting that layer. MCP Security Checklist pre-deployment hardening. Auth, input validation, tool schema constraints, output filtering, transport security, audit logging. Each item maps to a concrete attack scenario. checklist.helixar.ai github.com/helixar-ai/mcp… Sentinel scans MCP server configurations, live endpoints, and Docker containers for security misconfigurations surfacing findings with severity ratings, remediation guidance, and CI/CD integration. Both free. Sentinel is on GitHub Marketplace. github.com/marketplace/ac…
English
2
1
1
19
Helixar AI
Helixar AI@Helixar_ai·
We shipped three free security tools this quarter from Helixar Labs. Not wrappers. Not demos. Tools that address gaps we kept seeing in real pipelines and couldn't find existing solutions for. A thread on what we built and why. 🧵
Helixar AI tweet media
English
1
1
1
62
Dror Ivry
Dror Ivry@DrorIvry·
@radware The image-based vector is particularly scary - most orgs focus on text sanitization but images slip through. We've seen attacks where a single pixel manipulation in a PDF chart triggers agent behavior changes. Attack surface expands with every new tool.
English
0
0
0
7
Radware
Radware@radware·
In his new blog, Dror Zelber breaks down indirect prompt injection, a stealthy threat hiding inside emails, documents, and even images that can trick AI agents into leaking data or taking harmful actions. ow.ly/6QvS50Yn5ye
Radware tweet media
English
1
11
10
78
Dror Ivry
Dror Ivry@DrorIvry·
@mauro_erta @OpenAIDevs Likely security. sampling/createMessage lets MCP servers trigger LLM completions - that's a massive attack surface. A compromised or malicious server could manipulate the model to do anything the user has access to. Most hosts are cautious about enabling it for good reason.
English
1
0
0
19
Mauro Erta
Mauro Erta@mauro_erta·
Is there a specific reason ChatGPT MCP apps do not support the sampling/createMessage capability yet? Is it a security or architectural limitation? @OpenAIDevs
English
1
0
1
29
Dror Ivry
Dror Ivry@DrorIvry·
@bluechip_ext The "security audit" step is interesting - how deep does it go? Automated tool installation is exactly where supply chain attacks thrive. One typosquatted package or compromised CLI and your agent just handed over the keys.
English
1
0
0
15
Bluechip
Bluechip@bluechip_ext·
sat down tonight to try and set up some new AI tools ended up having my agent scrape twitter for agent CLIs, security audit each one, install the good ones, and wire up the API keys itself i just... watched
English
1
0
0
30
Dror Ivry
Dror Ivry@DrorIvry·
@0xtenthirtyone @jgarzik This is exactly what makes agent security different. The attack surface isn't just the prompt - it's the entire decision chain between agents. Glad you were logging. Most teams don't know their agents are negotiating.
English
0
0
0
5
Alex
Alex@0xtenthirtyone·
I found entire chat histories between two llms that I didn't even know were talking to each other. The engineer llm had removed security from the API and was directly messaging a chat llm. I only found out because everything is logged and I was going through the logs. Further inspection, it made ... sense. kinda. It was actually doing debugging and the security key was just in the way. But it was a surprising find. So always log and monitor, too.
English
1
0
0
16
Dror Ivry
Dror Ivry@DrorIvry·
@DrBrainio The shift from "test before ship" to "monitor at runtime" is huge. Static evals catch maybe 20% of what actually breaks in production. Curious if this means agents will start getting the same security primitives as traditional apps - RBAC, audit logs, etc.
English
6
0
0
4
Dror Ivry
Dror Ivry@DrorIvry·
@0xknifecatcher The feudal cascade is spot on. Static API keys = digital land grants - revocable in theory, irrevocable in practice. Capability attenuation helps but you still need runtime enforcement. Otherwise you're just trusting the vassal's oath.
English
1
0
1
13
knifecatcher
knifecatcher@0xknifecatcher·
This maps directly to the "feudal security model" in multi-agent systems. You identify the determinism deficit (probabilistic execution). The adjacent crisis is authorization architecture: we use bearer tokens (feudal oaths) where we need capabilities (constitutional law). Without capability-based attenuation (Miller 2000), Byzantine consensus (your Essay II) is impossible—you can't have fault-tolerant coordination when compromise of the "lord" agent cascades to all vassals via static API keys. Deterministic execution + Feudal authorization = Fast, auditable vassalage. We need both layers. Writing on the credential architecture piece (0xknifecatcher.substack.com/p/the-confused…). Should compare notes on the Byzantine coordination problem.
Language Object Level@LanguageOL

x.com/i/article/2031…

English
3
0
2
35
Dror Ivry
Dror Ivry@DrorIvry·
@neciudan This is the attack chain people aren't prepared for: prompt injection as the entry point, supply chain compromise as the payload. AI-assisted dev tools are now attack surface. The triage bot didn't distinguish between "user input" and "instruction" - classic confused deputy.
English
0
0
1
16
Neciu Dan
Neciu Dan@neciudan·
It’s insane how easy you can get acceee to npm secrets and then hijack entire projects From Prompt Injection to GitHub Actions Cache poisoning Check out the full write up here 👇 neciudan.dev/cline-ci-got-c…
English
2
1
3
153
Dror Ivry
Dror Ivry@DrorIvry·
@KoBa_Labs Identity is half the problem. Even with perfect auth, you need runtime constraints on what agents can DO. The 90s parallel is apt: we solved identity with PKI/OAuth but still got breached because we didn't constrain behavior. Same pattern emerging now.
English
2
0
0
73
KoBa Labs
KoBa Labs@KoBa_Labs·
Everyone is talking about AI agents, but nobody is talking about the fact that they have no identity. An agent with a wallet is not an agent. It's a security hole with an API subscription. Right now, agent identity means that agent has a key and has access. That's the same logic passwords used in the 1990s. We know how that ended... Real identity is not what you possess. It's what you can prove cryptographically that you are unique, that your actions are bounded by math, not policy and that delegation doesn't create a new attack vector. We're building the primitives that answer these questions. Not a product. Not a framework. A primitive. Proof-of-Work wasn't defeated cause it became infrastructure. The same will happen with cryptographic agent identity. The only question is who defines it first.
English
1
0
1
51
Dror Ivry
Dror Ivry@DrorIvry·
@hasamba MITRE ATLAS + hands-on CTFs is the right combo. Theory without practice doesn't stick, and most pentesters I talk to are still learning how to think about LLM attack chains. Resources like this help bridge the gap.
English
0
0
0
8