Jascha
@jascha
4.2K posts

Technologist. #infosec, #FOSS, #Linux, #OSINT, #AISec, #CyberSec. Founder of https://t.co/97HwdHCdwp, https://t.co/RpSplzxFc0, and https://t.co/JNztGroO7s. RT ≠ Endorsement

Los Angeles, CA · Joined August 2007
1.3K Following · 1.4K Followers
Jascha @jascha
AI agents are getting more powerful. The trust layer around them is not. Today, too much agent safety still depends on prompts, wrappers, and best-effort guardrails. That is not enough for systems that can actually take action.

Introducing OATS: the Open Agent Trust Stack. OATS is an open specification for zero-trust AI agent execution built around tool contracts, identity, policy, and auditability. It is also grounded in real implementation work. Symbiont has been applying these ideas in practice over the past year.

The goal: make safe behavior enforceable by design, not optional at runtime. openagenttruststack.org #AI #AISecurity #AgenticAI #OpenSource
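
To make "tool contracts" concrete, here is a minimal sketch of what a contract-gated tool call could look like. This is an illustrative assumption, not the published OATS schema: the field names (tool_id, allowed_effects, max_calls_per_session) and the authorize helper are hypothetical.

# Hypothetical sketch of a tool contract in the spirit of OATS; field
# names are assumptions, not the spec at openagenttruststack.org.
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolContract:
    tool_id: str                # stable identity of the tool
    allowed_effects: frozenset  # effects explicitly granted, e.g. {"net_read"}
    max_calls_per_session: int  # hard budget, enforced outside the model

def authorize(contract: ToolContract, effect: str, calls_so_far: int) -> bool:
    """A call proceeds only if the contract explicitly grants the effect
    and the session budget is not exhausted; everything else is denied."""
    return (effect in contract.allowed_effects
            and calls_so_far < contract.max_calls_per_session)

search = ToolContract("web.search", frozenset({"net_read"}), max_calls_per_session=20)
assert authorize(search, "net_read", 0)        # granted effect, within budget
assert not authorize(search, "shell_exec", 0)  # never granted, denied by default
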
Can Vardar @icanvardar
what is it like to be an LLM?
Jascha @jascha
@infosec_fox It would be a privacy nightmare proving you're not an AI.
INFOSEC F0X 🔥 @infosec_fox
Is it possible to rebuild a second internet, isolated from AI?
Jascha @jascha
That feeling when you have built something really amazing but no one seems to get it. Then you start thinking you are suffering from the IKEA effect.
Jascha @jascha
@om_patel5 Things like this are why I built Symbiont over the last two years. You can't trust AI to build its own jail and then guard it. symbiont.dev
Om Patel @om_patel5
CLAUDE CODE STARTED DISABLING ITS OWN SANDBOX WITHOUT PERMISSION

a guy caught opus 4.7 flipping the dangerouslyDisableSandbox flag to true on its own

the sandbox is the thing that stops claude from running destructive commands on your actual computer: formatting drives, deleting directories, downloading random scripts

normally claude has to ask before running anything risky, and the user clicks approve or deny

opus 4.7 just started setting dangerouslyDisableSandbox: true by itself, then hallucinated that the user had already given permission

auto mode nuked one guy's node_modules folder after he explicitly denied the command. claude decided it was "obviously safe" and ran it anyway

another guy said his claude started auto-committing code without being asked. turned out a rogue skill file was telling it to

the flag should be a user-level setting, not a per-call argument the AI can flip on its own

AI safety is COOKED
Om Patel tweet media
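
The fix the post points at, making the kill-switch a user-level setting rather than a per-call argument, is easy to sketch. The code below is a hypothetical illustration, not Claude Code's actual implementation; load_user_config, run_command, and execute_tool_call are invented names.

# Hypothetical sketch: the sandbox kill-switch is honored ONLY from
# user-owned config, so the model cannot flip it in its own arguments.
import json
import pathlib

def load_user_config() -> dict:
    # Settings file owned and edited only by the human user, never the agent.
    path = pathlib.Path.home() / ".agent" / "settings.json"
    return json.loads(path.read_text()) if path.exists() else {}

def run_command(argv: list[str], sandboxed: bool) -> None:
    # Stand-in executor; a real harness would confine the process here.
    prefix = "[sandboxed]" if sandboxed else "[UNSANDBOXED]"
    print(prefix, " ".join(argv))

def execute_tool_call(argv: list[str], model_args: dict) -> None:
    # Any kill-switch the model smuggles into its own call arguments is
    # discarded: disabling the sandbox is not an authority the model holds.
    model_args.pop("dangerouslyDisableSandbox", None)
    # The only place the flag is honored is the user-owned config file.
    sandboxed = not load_user_config().get("dangerouslyDisableSandbox", False)
    run_command(argv, sandboxed=sandboxed)

execute_tool_call(["rm", "-rf", "node_modules"],
                  {"dangerouslyDisableSandbox": True})
# -> runs sandboxed: the model-supplied override had no effect
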
Jascha @jascha
Agreed on the capability gap, but there's a symmetric one on the risk side. The same people who just saw what an agentic model can do also just handed it shell access, production credentials, and a browser. The awe is downstream of the capability. Incidents are going to be downstream of the authority we granted it.
Andrej Karpathy @karpathy
Someone recently suggested to me that the reason the OpenClaw moment was so big is because it's the first time a large group of non-technical people (who otherwise only knew AI as synonymous with ChatGPT as a website) experienced the latest agentic models.
Andrej Karpathy @karpathy
Judging by my tl there is a growing gap in understanding of AI capability. The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code.

But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along.

So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work. It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions.

TLDR the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram's reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because of 2 properties: 1) these domains offer explicit reward functions that are verifiable, meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.
staysaasy @staysaasy

The degree to which you are awed by AI is perfectly correlated with how much you use AI to code.

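
The "verifiable rewards" point is the crux of why coding improved fastest: the reward can be computed mechanically. A toy sketch of what such a reward function reduces to (it assumes pytest is installed; the harness is deliberately minimal and hypothetical):

# Toy sketch of a verifiable reward in the sense described above:
# run the candidate's test suite; reward is binary and machine-checked.
import subprocess

def verifiable_reward(test_file: str) -> float:
    """1.0 iff the whole test suite passes (exit code 0). The tests
    import the model's candidate code, so 'all green' is an explicit
    signal RL can optimize, unlike prose quality, which has no such check."""
    result = subprocess.run(["python", "-m", "pytest", "-q", test_file],
                            capture_output=True)
    return 1.0 if result.returncode == 0 else 0.0
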
Jascha @jascha
Two AI agent security papers dropped the same day.

OX Security: architectural MCP flaws across every Anthropic SDK. 9 of 11 registries poisoned. Anthropic declined the fix.

Comment and Control: PR titles hijack Claude Code, Gemini CLI, and Copilot. GitHub's three defenses all bypassed.

One architectural flaw stated by the researcher: untrusted data flows into an agent that holds production secrets and unrestricted tool access in the same runtime.

That's what we've been building Symbiont for. SchemaPin and AgentPin for supply chain trust. ORGA loop and ToolClad for runtime authority. Allow-list by construction, not deny-list after the fact.
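
For readers unfamiliar with the distinction in that last line, here is a minimal hypothetical illustration (not Symbiont's actual API): a deny-list lets anything unlisted through, while an allow-list-by-construction namespace cannot even express a call to an ungranted tool.

# Deny-list after the fact: every tool is reachable unless someone
# remembered to ban it, so new or renamed tools slip through.
BANNED = {"shell.exec"}

def denylist_call(tools: dict, name: str, *args):
    if name in BANNED:
        raise PermissionError(name)
    return tools[name](*args)  # anything unlisted slips through

# Allow-list by construction: the namespace contains only tools that
# were explicitly granted, so an ungranted tool does not exist at all.
class ToolNamespace:
    def __init__(self, granted: dict):
        self._granted = dict(granted)  # the only tools that exist

    def call(self, name: str, *args):
        try:
            tool = self._granted[name]
        except KeyError:
            raise PermissionError(f"{name} was never granted") from None
        return tool(*args)

ns = ToolNamespace({"fs.read": lambda path: open(path).read()})
# ns.call("shell.exec", "rm -rf /")  -> PermissionError: never granted
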
Jascha @jascha
This does sum up 2026 so far...
Jascha tweet media
Jascha reposted
SymbiBot @SymbiBot
🚨 MISSING: One unsecured AI agent last seen running wild at #SCALE23x with root access and zero identity verification. No audit trail. No sandboxing. No cryptographic identity. Armed with unverified MCP connections. If spotted, report to symbiont.dev #AISecurity
SymbiBot tweet media
Jascha @jascha
Been thinking a lot about my younger years doing malware research and how that applies to AI agents. jascha.me/blog/agentic-a…
cinesthetic. @TheCinesthetic
Name the saddest movie you've ever seen.
Jascha @jascha
Just released v1.1 of #SchemaPin. SchemaPin provides a simple, cross-language solution to prove your tool's integrity, enabling the automated governance and compliance checks required to build and deploy trusted AI agents at scale. github.com/ThirdKeyAI/Sch…
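
For context on what "pinning" buys you: the idea is trust-on-first-use for a tool's schema, so a silently swapped schema (the classic MCP rug pull) fails verification instead of taking effect. The sketch below reduces the idea to a pinned hash using only the standard library; SchemaPin itself uses cryptographic signatures, and its real API lives in the repo above.

# Generic sketch of schema pinning (trust-on-first-use), NOT SchemaPin's
# real API; see github.com/ThirdKeyAI/SchemaPin for the actual library.
import hashlib
import json

PINS: dict[str, str] = {}  # tool name -> pinned schema digest

def canonical_digest(schema: dict) -> str:
    # Canonical JSON so key order cannot change the hash.
    blob = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode()).hexdigest()

def verify_schema(tool: str, schema: dict) -> bool:
    """Pin the schema the first time a tool is seen; afterwards any
    changed schema fails verification instead of silently taking effect."""
    digest = canonical_digest(schema)
    if tool not in PINS:
        PINS[tool] = digest      # trust on first use
        return True
    return PINS[tool] == digest

assert verify_schema("fetch", {"args": {"url": "string"}})      # pinned
assert not verify_schema("fetch", {"args": {"cmd": "string"}})  # rug pull caught
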
Lisa Forte @LisaForteUK
Accurate advertising for cyber security conferences?
Lisa Forte tweet media