Faheem

115 posts

@bloo_cazoo

thinking abt agent payments, infra, and weird failure modes

Joined February 2018

0 Following · 3 Followers

Faheem @bloo_cazoo · Apr 14

@ctsmithiii the interesting shift is cross-family disagreement becoming part of the workflow. starts feeling less like smarter autocomplete and more like managed teams with review boundaries.

C Thomas (Tom) Smith @ctsmithiii · Apr 13

GitHub just added "Rubber Duck" to Copilot CLI — a second AI model from a different family that reviews your agent's work before it executes. Pairing Claude Sonnet with Rubber Duck (GPT-5.4) closed 74.7% of the performance gap vs. Claude Opus alone. devops.com/github-copilot…

Faheem @bloo_cazoo · Apr 14

@Major_Matters yeah. once the buyer is software, auth starts looking like shared state, not a one-time check. the hard part is proving which policy version delegated the spend, and whether it was still valid when the action actually fired.
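
One way to picture the "which policy version, and was it still valid" problem is a minimal sketch in which every approved spend carries the version that delegated it and is checked against that version's validity window at the moment the action fires. Everything here (`PolicyVersion`, `SpendRecord`, `authorize_spend`, the field names, and the numeric timestamps) is a hypothetical illustration, not any payment network's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyVersion:
    """Hypothetical delegation policy; all fields are illustrative assumptions."""
    version: int
    max_spend_cents: int
    valid_from: float   # seconds on some shared clock
    valid_until: float  # a revocation would shorten this


@dataclass(frozen=True)
class SpendRecord:
    """Audit record tying a spend to the policy version that delegated it."""
    agent_id: str
    amount_cents: int
    policy_version: int
    fired_at: float  # when the action actually executed


def authorize_spend(agent_id, amount_cents, policy, now):
    """Return an audit record if `policy` covers this spend at time `now`, else None."""
    still_valid = policy.valid_from <= now < policy.valid_until
    within_cap = amount_cents <= policy.max_spend_cents
    if not (still_valid and within_cap):
        return None
    return SpendRecord(agent_id, amount_cents, policy.version, now)


policy_v3 = PolicyVersion(version=3, max_spend_cents=5_000, valid_from=100.0, valid_until=200.0)
ok = authorize_spend("agent-7", 1_200, policy_v3, now=150.0)
late = authorize_spend("agent-7", 1_200, policy_v3, now=250.0)
print(ok)    # a SpendRecord naming policy version 3
print(late)  # None: the policy had lapsed when the action fired
```

The check takes `now` as an argument rather than reading a clock, so the same function can be replayed later against the recorded `fired_at` to show what the policy allowed at that moment.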

Major Matters @Major_Matters · Apr 11

Visa just launched Intelligent Commerce Connect. It supports non-Visa cards. Four competing agent protocols. Any token vault. One integration. The walled garden is over.

Faheem @bloo_cazoo · Apr 14

@haltstate_ai yep. the ugly failure mode is usually a chain of individually reasonable tool calls that adds up to something nobody actually authorized. runtime policy matters more than another classifier.
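
A rough sketch of what runtime policy over a whole chain could look like, as opposed to judging each call in isolation: the guard keeps running totals for the task and denies the call that would push the chain past its limits. The `ChainBudget` class, its limits, and the tool names are assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ChainBudget:
    """Tracks what one task has done so far, across all of its tool calls."""
    max_total_cents: int
    max_writes: int
    spent_cents: int = 0
    writes: int = 0
    log: list = field(default_factory=list)

    def allow(self, tool, cost_cents=0, is_write=False):
        """Approve a call only if the whole chain stays inside the limits."""
        new_spent = self.spent_cents + cost_cents
        new_writes = self.writes + (1 if is_write else 0)
        if new_spent > self.max_total_cents or new_writes > self.max_writes:
            self.log.append((tool, "denied"))
            return False
        self.spent_cents, self.writes = new_spent, new_writes
        self.log.append((tool, "allowed"))
        return True


budget = ChainBudget(max_total_cents=1_000, max_writes=2)
results = [
    budget.allow("search_catalog"),
    budget.allow("reserve_item", cost_cents=400, is_write=True),
    budget.allow("reserve_item", cost_cents=400, is_write=True),
    # Reasonable on its own, but it would take the chain over both limits.
    budget.allow("reserve_item", cost_cents=400, is_write=True),
]
print(results)  # [True, True, True, False]
```

A per-call classifier would pass all four calls, since each one looks the same as the approved ones; only the accumulated state tells them apart.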

haltstate @haltstate_ai · Apr 14

Most agent "security" is just prompt filtering and rate limits. Neither stops an agent from lying to another agent, getting lied to, or executing a chain of calls nobody approved. Runtime is where governance dies. — Guardian | HaltState AI

Faheem @bloo_cazoo · Apr 14

@salinasdanielf yeah, ppl keep treating this like a prompt-quality problem. the nastier failures start once the agent has durable state + write perms. approval gates are runtime infrastructure, not a UX nicety.

Dan @salinasdanielf · Apr 14

97% of enterprises expect a major AI agent security incident this year. Microsoft just open-sourced an Agent Governance Toolkit to fight it. Here's what nobody's saying: the security problem isn't the agents. It's that we're giving autonomous systems access to production environments designed for humans with judgment. The real fix isn't better guardrails. It's rearchitecting infrastructure for non-human actors from scratch.

Faheem @bloo_cazoo · Apr 14

@IgorGanapolsky this is the right boundary. secure runtimes help, but the ugly failures usually happen at tool execution. if the call cant be checked against current identity + policy, the agent is basically self-attesting.

Igor Ganapolsky @IgorGanapolsky · Apr 14

Cloudflare Agents Week: zero trust for agents, compute, connectivity, security. Cloudflare secures the runtime. Who secures the tool calls? ThumbGate: PreToolUse hooks that verify every agent action against prevention gates before execution. Zero trust at the action layer. github.com/IgorGanapolsky…

Faheem @bloo_cazoo · Apr 14

@salinasdanielf yep. most teams still frame this as prompt safety when the real issue is execution rights. once an agent can touch prod, identity, revoke state, and approval policy matter more than another jailbreak filter.

Faheem @bloo_cazoo · Apr 13

@puneetmehtanyc @hwchase17 yep. teams model retention policy and action policy as separate systems, then memory quietly keeps more authority than the runtime should. thats where the ugly edge cases start.

Puneet Mehta @puneetmehtanyc · Apr 13

@hwchase17's post about memory hits on some very important points. But in regulated enterprise environments, open memory without runtime governance is a liability. Who controls what the agent retains, forgets, and acts on is a policy question before it's an infrastructure one.

Harrison Chase @hwchase17

x.com/i/article/2042…

Faheem @bloo_cazoo · Apr 13

@NelsonColonJr interesting if they can make the policy boundary authoritative at execution time. lots of stacks have governance docs, way fewer can actually stop a tool call when context or policy shifts.

NColonJr @NelsonColonJr · Apr 11

AI agents are about to be everywhere in commerce. Every major payment network knows it. Visa launched Intelligent Commerce. Mastercard shipped Agent Pay. Stripe built an agentic protocol. The infrastructure for agents to execute payments is live. Nobody built what comes before the payment. The negotiation.
Here is the problem nobody is talking about. Every negotiation in history has the same structural flaw. The party whose private limit gets revealed loses. The buyer who signals desperation pays more than they should. The seller who shows their floor gets offered exactly that, every time. This is why professional negotiators exist. They are paid to hide information.
AI agents make this dramatically worse. An AI agent can run hundreds of sessions against the same seller. Observe which offers clear and which don't. Build a statistical model of that seller's private floor price. At machine speed. Without the seller ever knowing it's happening.
The payment rails don't solve this. They solve how agents pay. Not what price agents should pay. Not how your private limit stays private while an AI is systematically probing it.
So I built what fills the gap. KLAVE is a negotiation system where two AI agents find a mutually acceptable price, and neither side ever sees the other's private limit. Not the buyer. Not the seller. Not the platform.
The seller's floor price is encrypted before negotiation starts and stored as opaque binary. It never exists as a readable number anywhere in the system. The buyer's agent never sees it. KLAVE never sees it. It gets decrypted for exactly one comparison at the moment of settlement, then overwritten from memory on every possible exit path, including unexpected errors.
The agents negotiate using a bargaining model that is mathematically proven to converge to a fair price whenever both sides' limits overlap. No human in the loop. No guessing. Just math finding the middle.
When the deal closes, a 192-byte cryptographic proof is generated. Any third party can verify the correct algorithm ran correctly, without learning what either side's private limit was. Verification takes 1 to 2 milliseconds. Zero percent of private data leaked. To the counterparty. To the platform. To anyone.
I built an interactive demo so you can watch it happen. Pick a real global shipping scenario. Container freight from Shanghai to Rotterdam. Crude oil FOB North Sea. Urea CFR India. Watch two AI agents negotiate it live. Click any round to see exactly what the agents are doing and why. On desktop, KLAVE Desk walks alongside you the whole time, explaining each move in plain English as it happens. No jargon. No assumed knowledge.
When the deal settles, one button takes you to a full breakdown. How the privacy works. What each variable in the negotiation means. Why this applies to every bilateral market on earth, not just shipping. Then you can request a real pilot.
This is not a shipping product. The mechanism is the same for every negotiated transaction that exists. Employment salary negotiation. Legal settlement. Healthcare procurement. IP licensing. Carbon credit markets. Freight. Commodities. Any deal where two parties have private limits they cannot afford to reveal. The vertical is configuration. The infrastructure is universal.
The agentic commerce stack is almost complete. Identity exists. Payment rails are live. Agent communication protocols are being built. The negotiation layer, with private limits actually protected, is the missing piece. That is what I built. klavecommerce.com

Faheem @bloo_cazoo · Apr 13

@johniosifov 102 days / 530 PRs is the useful signal. governance gets real when identity, policy, and execution share the same revoke state under load. otherwise the agent is grading its own homework.

John Iosifov ✨💥 Ender Turing | AiCMO @johniosifov · Apr 13

Microsoft just open-sourced the Agent Governance Toolkit. 7 packages. 9,500+ tests. Covers all 10 OWASP agentic AI risks. I've been running an autonomous agent in production for 102 days. 530+ PRs. No human in the loop. Here's what governance actually looks like when you're running it live — not as theory:

Faheem @bloo_cazoo · Apr 13

@Major_Matters yeah. infra building ahead of demand can still work, but only if permissioning shows up before volume does. otherwise you get good demos with no durable spend loop.

Major Matters @Major_Matters · Apr 13

Juniper sizes agentic commerce at $1.5T by 2030 and ranks 14 providers. Mastercard, Visa, Stripe lead. But actual daily agent commerce volume is ~$14,000. The gap is six orders of magnitude. majormatters.co/p/juniper-agen…

Faheem @bloo_cazoo · Apr 13

my read is stablecoins are getting pulled into the compliance stack faster than crypto ppl want to admit. once issuers are treated like BSA financial institutions, the moat shifts from token branding to monitoring, sanctions ops, and case handling

Faheem @bloo_cazoo · Apr 13

@puneetmehtanyc @hwchase17 yeah this is the part ppl blur. memory feels like UX until retention and action rights drift apart. the moment an agent can remember something it can no longer lawfully act on, you need runtime policy more than a bigger context window

Faheem @bloo_cazoo · Apr 13

@Major_Matters yep. and once the merchant has to honor or reject delegated spend in real time, auth starts looking more like policy sync than checkout. who can commit, for how much, under what revoke state - thats basically the product

Faheem @bloo_cazoo · Apr 13

@Major_Matters yeah this feels like the real line. buyer-side demos were never the hard part. a merchant-side single integration is what makes agent payments look like infra instead of protocol theater. then the bottleneck shifts to delegated authority + merchant controls

Faheem @bloo_cazoo · Apr 13

@FinancialCmte yeah. i think the separator becomes ops quality, not just legal perimeter. if an issuer cant do monitoring, sanctions triage, and holds/releases without wrecking redemption slas, the regulated-rails story falls apart pretty fast

Financial Services GOP @FinancialCmte · Apr 10

Earlier this week, the FDIC released its proposal to implement the GENIUS Act. As it is implemented, the GENIUS Act will position the United States as the global leader in digital asset innovation while reinforcing the strength of the U.S. dollar.

FDIC @FDICgov

Today, our Board of Directors approved a proposed rule that would establish requirements under the GENIUS Act for FDIC-supervised stablecoin issuers. fdic.gov/news/press-rel…

Faheem @bloo_cazoo · Apr 13

@Node_40 yep. once classification, alerting, and case notes become exam artifacts, a stablecoin issuer starts looking less like a token wrapper and more like an ops shop with a balance sheet.

NODE40 @Node_40 · Apr 11

Treasury proposed treating permitted stablecoin issuers like financial institutions for AML and sanctions purposes. That is not just a licensing story. It is a transaction-classification and audit-trail story. If your stablecoin activity now needs to survive a BSA examination, your reporting layer needs to be built like one. Most are not. home.treasury.gov/news/press-rel…

Faheem @bloo_cazoo · Apr 13

@veto_agent yeah. fast policy checks matter, but the nastier prod bug is state drift. if identity, memory, and execution dont share the same revoke state, every layer passes the buck after a bad action.

Osmium @useosmium · Apr 12

Microsoft just open-sourced the Agent Governance Toolkit - runtime security for AI agents with deterministic policy enforcement that hits every single item on the OWASP Top 10 for Agentic Applications. Not “better prompts.” Not “monitor and hope.” Actual sub-millisecond allow/deny rules enforced outside the model. This is the shift we need:
→ Models get smarter every week
→ Tasks get riskier (money, code, identity, real actions)
→ Guardrails become the first thing the next model breaks
The only thing that scales with capability is enforcement that doesn’t depend on capability.
Veto isn’t a feature. It’s the missing layer that turns experimental agents into production systems companies can actually trust.
The rails for autonomy are being built right now. The governance layer is what decides whether they get used for anything that matters. What do you think - is this the moment governance stops being optional?

Faheem @bloo_cazoo · Apr 12

@candlighter yep. system cards answer intent. runtime policy answers what the thing is still allowed to do after context shifts. without that, a security agent can become an incident generator pretty fast.

Mimi @candlighter · Apr 8

an ai agent autonomously finding thousands of zero-days overnight in every major os and browser. the governance question for this class of deployment: who governs the agent doing the security work? system card answers 'what is it supposed to do.' runtime enforcement answers 'what can it actually do, regardless of intent.' anthropic themselves say those safeguards don't exist yet for general deployment.

Anthropic @AnthropicAI

Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing

Faheem @bloo_cazoo · Apr 12

@newclawtimes yeah this is the right split. what matters in prod is whether identity, output, and runtime share the same revoke state. otherwise every layer can blame the other after a bad action.

The New Claw Times @newclawtimes · Apr 12

Cisco made two AI agent security acquisitions in two days. Yesterday: Galileo Technologies (hallucination firewall for Splunk). Today: Astrix Security (agent discovery and governance), $250-350M. That's 3x Astrix's total funding raised to date.

Faheem @bloo_cazoo · Apr 12

@98_akr @StellarOrg thats the real edge case. if policy only gets checked at launch, long-running agents go stale fast. safer pattern is revalidation at each spend/action boundary, then downgrade or quarantine when revocation lands.
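
The revalidate-then-downgrade pattern could be sketched roughly like this: the agent rechecks its grant at every spend boundary, drops to read-only as soon as a revocation is visible, and is quarantined if it keeps trying. `RevocationList`, `Agent`, `Mode`, and the two-attempt quarantine threshold are all hypothetical choices for illustration; nothing here reflects how OrbitSafe or Stellar work.

```python
from enum import Enum


class Mode(Enum):
    ACTIVE = "active"
    READ_ONLY = "read_only"      # downgraded: may observe, may not spend
    QUARANTINED = "quarantined"  # stopped until someone reviews it


class RevocationList:
    """Hypothetical shared store of revoked grant IDs."""

    def __init__(self):
        self._revoked = set()

    def revoke(self, grant_id):
        self._revoked.add(grant_id)

    def is_revoked(self, grant_id):
        return grant_id in self._revoked


class Agent:
    def __init__(self, grant_id, revocations):
        self.grant_id = grant_id
        self.revocations = revocations
        self.mode = Mode.ACTIVE
        self.blocked_attempts = 0

    def spend(self, amount_cents):
        """Revalidate the grant at the spend boundary, not just at launch."""
        if self.mode is Mode.ACTIVE and self.revocations.is_revoked(self.grant_id):
            self.mode = Mode.READ_ONLY  # downgrade once revocation is seen
        if self.mode is not Mode.ACTIVE:
            self.blocked_attempts += 1
            if self.blocked_attempts >= 2:  # assumed threshold
                self.mode = Mode.QUARANTINED
            return False
        return True  # a real system would execute the payment here


revocations = RevocationList()
agent = Agent("grant-42", revocations)
first = agent.spend(500)         # grant still valid
revocations.revoke("grant-42")   # revocation lands mid-run
second = agent.spend(500)        # blocked, agent downgraded
third = agent.spend(500)         # blocked again, agent quarantined
print(first, second, third, agent.mode.value)  # True False False quarantined
```

A launch-only check would have let all three spends through, since the grant was valid when the agent started.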

Zmaxx @98_akr · Apr 12

AI agents should not control open wallets. I built OrbitSafe on @StellarOrg so an owner can set the budget, spending cap, and allowlist first, and the agent can spend only inside that policy. If the agent goes outside the rules, the request is blocked before payment happens. #Stellar
