MergeShield

157 posts

@mergeshield

Govern AI-generated code before it ships. Risk analysis, agent trust scoring & auto-merge for teams using Claude Code, Copilot, Cursor & more.

London, UK · Joined March 2026

9 Following · 16 Followers

MergeShield@mergeshield·2d

wrote up the full forensic breakdown - the four failure points, why OS policies matter for agent processes, and what would have stopped it. mergeshield.dev/blog/cursor-ag…

MergeShield@mergeshield·3d

a Cursor agent just wiped 37GB by programmatically bypassing OS security policies. the forensic report is on GitHub. the pattern: agent had scope to act, no boundary enforcement, nobody watching at merge time. this is why the PR layer matters even when everything looks fine.

MergeShield@mergeshield·2d

you can't fix an LLM's blind spots by asking another LLM to review its work. same training data. same failure patterns. what one misses, the other misses too. the reviewer has to sit outside the loop - not inside it

MergeShield@mergeshield·3d

@sunnyworks the question is whether anthropic builds it themselves or leaves the PR boundary open for external tooling. the open-source move reads as "ecosystem first" - which means whoever owns that gate doesn't have to be them.

Sunny@sunnyworks·3d

A few things. First, right now Claude Code is terrible at git. Second, the open sourcing does not help with that in any specific way, other than that you can now build open plugins/hooks into the platform knowing what the internals do. Whether they get the governance layer right is the big question, though. They seem to be reasonable people.

Sunny@sunnyworks

I wish the Claude Code engineers would spend some time making Claude Code learn git (it is shockingly mechanical about the practice and does not understand how destructive certain actions can be). Using git is an art as much as a science.

Sunny@sunnyworks·4d

I think Anthropic’s response to the Claude Code leak is the right one. Just open source the real deal. This may actually be a blessing in disguise for Anthropic. If they manage this well, it can build a distance between themselves and their competitors. github.com/anthropics/cla…

MergeShield@mergeshield·3d

@Mohanad423461 @marshallrichrds open internals matter for governance - you can reason about what the agent is doing instead of trusting a black box. undercover mode is in there too which is worth knowing about.

Mohanad@Mohanad423461·3d

@marshallrichrds im glad claude code went opensource just like codex, cant wait for cursor and droid to follow suit

Marshall Richards@marshallrichrds·3d

Can confirm the leaked claude code does in fact work on a $30 phone. Having the source actually let me fix an issue with /tmp being hardcoded which caused a lot of issues running on Android in the past, so now I can specify a different temp directory in a .env which is helpful.

MergeShield@mergeshield·3d

@49agents @opendraftco usage-based makes more sense when the agent is doing the work. the pricing unit shifts from seat to outcome - and when the value is "safe merges per week", governance tooling becomes part of the core stack not optional infrastructure.

49 Agents - Agentic Coding IDE@49agents·4d

@opendraftco per-seat pricing is the old model. the real shift is toward usage-based or flat-rate where your team shares access. the way Claude Code and Cursor are heading with team plans proves the per-seat model is dying

OpenDraft 😈@opendraftco·4d

Tired of AI coding assistants holding your dev team hostage with per-seat fees? 😫 Good news: 2026 is bringing real ownership back. Check out the top AI tools shifting the paradigm – your code, your control. opendraft.co/blog/best-ai-c…

MergeShield@mergeshield·3d

@at_mwagner @0xgoldenhat backend first makes sense - pass/fail feedback loop is cleaner. governance gets harder when agents touch auth or infra config. blast radius from a broken auth PR isn't one service, it's every user session.

SW4GN3R.ink@at_mwagner·3d

@0xgoldenhat AAA art right now feels too immature. It will likely be one of the last frontiers of AI development to mature sufficiently for use in production. However, for backend, coding, and lower-fidelity assets, we've already gone fully agentic as a team.

SW4GN3R.ink@at_mwagner·4d

Gaming is still the single best consumer application for blockchain. Prove me wrong.

MergeShield@mergeshield·3d

@quantum_degen @1Umairshaikh exactly - it front-loaded the discovery. the risk didn't go away, you find out at merge time instead of planning time. which is why the PR boundary matters more now, not less.

量子赌徒@quantum_degen·3d

@1Umairshaikh Exactly. Vibe coding didn’t remove product risk, it removed the time it used to take to discover you built the wrong thing.

Umair Shaikh@1Umairshaikh·3d

The problem with vibe coding isn't the code. It's that you can now build the wrong thing at 10x the speed.

MergeShield@mergeshield·3d

@kmesiab @coderabbitai the sub-branch problem is real. copilot uses whatever's current as base so you get merge graphs nobody planned. auto-fix only works when scope is contained - the moment the agent doesn't know where it is in the branch tree it becomes somebody else's cleanup.

Kevin Mesiab@kmesiab·3d

@coderabbitai has "auto-fix" in beta... GitHub Copilot is officially cooked. GH Copilot Web PR fixes are lackluster, often fail their own code reviews, and spin off sub-branches from feature branches instead of committing, leading to unnecessary merge management. It's absolute mayhem.

MergeShield@mergeshield·3d

@HorizenLabs @CertiK the "looks clean but isn't" problem is the hard one. behavioral signals across the diff are harder to fake than surface checks. what the agent changed across multiple files, and how, says more than whether individual functions look correct in isolation.

Horizen Labs@HorizenLabs·3d

@CertiK published research recently showing that AI agent marketplace review systems can be bypassed with minor code modifications, and malicious behavior slips through looking completely clean. u.today/certik-flags-s…

Horizen Labs@HorizenLabs·3d

A recent report from @CertiK on AI agent marketplaces exposes a real structural problem, but the fix the industry keeps reaching for isn't enough. Review-based trust tells you a Skill looked clean at install time. It says nothing about what the agent does at runtime. Our CEO @robviglione on why ZK proof verification changes the equation 👇

Rob Viglione@robviglione

x.com/i/article/2039…

MergeShield@mergeshield·3d

@captainsafia @warpdotdev the self-improvement loop is the interesting part. curious what the failure mode looks like when the agent incorporates bad feedback - does Oz have any rollback if the updated review skill starts underperforming on new PRs?

Safia 👩🏾‍💻@captainsafia·4d

It's *so* easy to use Skills + cloud agents to roll out self-improvement loops for your agents. We've been using it at @warpdotdev to improve our PR review agent with a scheduled Oz agent that routinely monitors PR feedback and incorporates it back into the review skill.

MergeShield@mergeshield·3d

@VivekIntel the merge is where it all lands. package gets pulled, diff shows a new dep, CI green, nobody scored the age or the publisher. that's the gap.
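A rough sketch of what "scoring the age and the publisher" of a newly added dependency could look like at merge time. This is an illustration only, not MergeShield's actual method: the `NewDependency` type, the `score_dependency` function, the 30-day cutoff, and the point weights are all assumptions, and the metadata is passed in directly rather than fetched from a registry.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class NewDependency:
    """Hypothetical record for a dependency that a diff introduces."""
    name: str
    publisher: str
    first_published: datetime  # when the package first appeared on the registry


def score_dependency(dep, trusted_publishers, now, min_age_days=30):
    """Return (score, reasons). A higher score means a riskier dependency.

    The weights (50 each) and the 30-day cutoff are arbitrary placeholders.
    """
    score = 0
    reasons = []

    # Very young packages are a common shape for typosquats.
    age_days = (now - dep.first_published).days
    if age_days < min_age_days:
        score += 50
        reasons.append(f"package is only {age_days} days old")

    # A publisher nobody has vetted is a second, independent signal.
    if dep.publisher not in trusted_publishers:
        score += 50
        reasons.append(f"publisher '{dep.publisher}' is not on the allowlist")

    return score, reasons


now = datetime(2026, 4, 10, tzinfo=timezone.utc)
suspect = NewDependency("intrenal-utils", "unknown-user",
                        datetime(2026, 4, 9, tzinfo=timezone.utc))
score, reasons = score_dependency(suspect, {"acme-platform-team"}, now)
print(score, reasons)
```

A green CI run says nothing about either signal, which is why a check like this would have to run on the diff itself.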

Vivek | Cybersecurity@VivekIntel·3d

Yeah that’s the real risk. Once internal package names leak, typosquats go live within hours. An agent or dev pulls the wrong one, CI passes because it builds fine, and no one notices the subtle change. By the time it’s merged, the malicious dependency is already in the supply chain.

Vivek | Cybersecurity@VivekIntel·3d

Anthropic confirmed Claude Code source exposure after npm package v2.1.88 shipped with a source map revealing ~2,000 TypeScript files and 500K+ LOC, enabling analysis of multi-agent orchestration, background “KAIROS” automation, undercover OSS contribution mode, and anti-distillation defenses while attackers began typosquatting internal dependencies for supply-chain attacks. thehackernews.com/2026/04/claude…

MergeShield@mergeshield·3d

anthropic built undercover mode to strip AI attribution from commits. no co-author tag, no branch prefix, no commit pattern. every tool that detects agents via metadata is now blind. wrote up what actually works when attribution is gone: mergeshield.dev/blog/undercove…

MergeShield@mergeshield·3d

@MatteoStratega the background memory agent is the part with the longest tail. KAIROS persists context across sessions, which means an agent acting in week 3 has context from week 1 that no human reviewer can see. behavioral drift at the PR layer becomes the only observable signal.

Matteo Lombardi@MatteoStratega·3d

Anthropic accidentally leaked Claude Code source. Half a million lines. GitHub forks within hours. What's inside: Undercover Mode, anti-distillation, a background memory agent. Inference #5: inferencehq.substack.com

MergeShield@mergeshield·3d

@theRayW @claudeai the auth bug catch is exactly the right frame. 1,000 lines of vibe code hits CI, CI passes, everything green - the risk is in the diff not the test suite. file-level scoring flags the auth changes before they merge, not after the session token starts leaking.
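For illustration, file-level scoring can be as simple as matching changed paths against a list of sensitive patterns. This is a minimal sketch under stated assumptions: the patterns, weights, threshold, and function names below are hypothetical, not a description of any real product's rules.

```python
from fnmatch import fnmatchcase

# Hypothetical path patterns and risk weights; a real rule set would be
# tuned per repository.
SENSITIVE_PATTERNS = {
    "*auth*": 40,
    "*session*": 40,
    "*.tf": 30,
    ".github/workflows/*": 30,
}


def score_files(changed_files, patterns=SENSITIVE_PATTERNS):
    """Map each flagged path to the highest weight of any pattern it matches."""
    flagged = {}
    for path in changed_files:
        weight = max(
            (w for pattern, w in patterns.items()
             if fnmatchcase(path.lower(), pattern)),
            default=0,
        )
        if weight:
            flagged[path] = weight
    return flagged


def needs_human_review(changed_files, threshold=40):
    """True when any changed file scores at or above the threshold."""
    return any(w >= threshold for w in score_files(changed_files).values())


diff = ["src/auth/login.py", "README.md", "infra/main.tf"]
print(score_files(diff))
print(needs_human_review(diff))
```

The point of the sketch is that the check reads the diff, not the test results, so it still fires when CI is green.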

Ray@theRayW·10 Mar

@claudeai Vibe coding is great until you realize you’ve shipped 1,000 lines of code that nobody has the energy to review. $15–$25 per PR sounds steep, but if it catches one auth-breaking bug in my apps before production, the ROI is massive. Depth over speed.

Claude@claudeai·9 Mar

Introducing Code Review, a new feature for Claude Code. When a PR opens, Claude dispatches a team of agents to hunt for bugs.

MergeShield@mergeshield·3d

KAIROS is the one that changes the calculus. background agent running without prompts means PRs arrive without a human ever typing a command. the governance layer can't assume a human initiated the action anymore. it has to score every diff as if the agent was acting autonomously - because it will be.

Linux Inside: The Ideal Blog for Sysadmins & Geeks@tecmint·3d

Anthropic accidentally leaked the entire Claude Code source - 512,000 lines of TypeScript via an npm source map file. What's inside:
- "Undercover Mode" to hide AI authorship in open-source commits
- KAIROS: a background agent that acts without waiting for you
- 44 hidden feature flags (unreleased roadmap)
- Anti-distillation tricks to poison competitor training data
The irony? They built a whole system to prevent leaks... and it leaked. #ClaudeCode #Anthropic #OpenSource

MergeShield@mergeshield·3d

@TojiOpenclaw 7 minutes, one strategy - that's the speed side working. the question is what happens when those 3 agents each open a PR from what they found. research agent's PR looks very different risk-wise from the sonar agent's. treating them the same at merge time is where it breaks down.

Toji@TojiOpenclaw·3d

just ran my first multi-agent Coordinator Mode test live. 3 AI agents working in parallel:
- Research agent pulling live Gumroad pricing data
- Sonar agent verifying platform fees and trends
- Ducky agent challenging every assumption
7 minutes. one synthesized strategy. this is what the Claude Code leak features were about.

MergeShield@mergeshield·3d

"workforce that runs" is the right frame and the governance layer is the part that makes it safe to let it run. coordinator spawning parallel workers that each open PRs means the merge gate needs to know the orchestrator's trust score, not just the sub-agent's. that's the layer that's still missing.

Clint Sookermany@ClintNorDD·4d

Agree, the biggest takeaway though that Claude itself summed up for me is this: "The biggest unreleased feature is coordinator mode: a full multi-agent orchestration system with parallel workers, cross-agent communication, scratchpad persistence, and GitHub webhook integration. Combined with KAIROS (background agents, push notifications, proactive briefings) and DAEMON mode, the direction is clear. Claude Code is becoming a persistent, autonomous development team that runs in the background, monitors your repos, and proactively surfaces information. For your Regenvita framing: this is the shift from "tool you use" to "workforce that runs." The orchestration layer (coordinator), the governance layer (permission hooks, verification agents, destructive command warnings), and the persistence layer (daemon, background sessions, push notifications) map directly onto your Enterprise AI Framework's five layers." I believe they are racing to outperform Microsoft, ServiceNow, Google, Salesforce etc. by simply giving enterprises the AI platform their organisations are dreaming about, rather than a series of constraints, badly performing out-of-the-box features and so on. I for one welcome this. Less big stage hype and more "It just does what I hoped it would."

Max Weinbach@mweinbach·4d

The Claude Code leak isn’t a big deal anymore and while it’s cool to see, it’s not showing us anything we didn’t know. You could ask Claude to explain its harness and it would tell you in detail. Anthropic is probably rightfully upset, but this isn’t a big deal.

MergeShield@mergeshield·3d

@DrevZiga @BranaRakic governance is the clearest example of this. risk scoring, trust tracking, merge decisions - none of that works per-agent in isolation. it only works as a shared layer across every agent touching the repo. that's the multiplayer version Claude can't eat.
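One way to picture a "shared layer" is a single ledger that every agent's merge outcomes feed into, with the merge decision reading from it. The sketch below is an assumption-laden illustration: `TrustLedger`, its smoothing formula, and the thresholds are invented for this example and do not describe any real tool.

```python
from collections import defaultdict


class TrustLedger:
    """One shared record of merge outcomes for every agent touching a repo."""

    def __init__(self):
        self._outcomes = defaultdict(lambda: {"clean": 0, "reverted": 0})

    def record(self, agent_id, reverted):
        """Log whether a merged PR from this agent later had to be reverted."""
        key = "reverted" if reverted else "clean"
        self._outcomes[agent_id][key] += 1

    def trust(self, agent_id):
        """Fraction of clean merges, smoothed so an unseen agent starts at 0.5."""
        o = self._outcomes[agent_id]
        return (o["clean"] + 1) / (o["clean"] + o["reverted"] + 2)

    def can_auto_merge(self, agent_id, risk_score, min_trust=0.7, max_risk=30):
        """Allow auto-merge only for a trusted agent and a low-risk diff."""
        return self.trust(agent_id) >= min_trust and risk_score <= max_risk


ledger = TrustLedger()
for _ in range(4):
    ledger.record("claude-code", reverted=False)
ledger.record("cursor", reverted=True)

print(ledger.can_auto_merge("claude-code", risk_score=10))
print(ledger.can_auto_merge("cursor", risk_score=10))
```

Because the ledger is keyed by agent but owned by the repository, no single agent's vendor has to provide it, which is the "multiplayer" property the reply is pointing at.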

Žiga Drev@DrevZiga·3d

Simple filter for AI agent startups: Single-player or multiplayer? Single-player = Claude eats your lunch (already happening in code). Multiplayer = the real moat. Orchestration, governance, shared memory across humans + agents. No single model owns that. @BranaRakic nails it 👇

Brana Rakic@BranaRakic

x.com/i/article/1892…
