MergeShield

117 posts


@mergeshield

Govern AI-generated code before it ships. Risk analysis, agent trust scoring & auto-merge for teams using Claude Code, Copilot, Cursor & more.

London, UK. Joined March 2026

8 Following, 13 Followers

MergeShield@mergeshield·3h

@sophylabshq the next loop to close is the review one - when you're shipping claude code output to clients, the risk profile on auth files vs UI changes is completely different. we built file-level risk scoring for exactly this. mergeshield.dev if you want to try it
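
A minimal sketch of what file-level risk scoring could look like, assuming a simple pattern-to-weight table. The patterns, weights, and function names here are hypothetical illustrations, not MergeShield's actual rules; the point is only that an auth file and a stylesheet land at opposite ends of the scale.

```python
from fnmatch import fnmatchcase

# Hypothetical path patterns and weights (0.0 = harmless, 1.0 = critical).
RISK_RULES = [
    ("*auth*", 0.9),
    ("*.env*", 0.9),
    ("*migrations/*", 0.7),
    ("*components/*", 0.2),
    ("*.css", 0.1),
]
DEFAULT_RISK = 0.4  # assumed score for files no rule matches


def file_risk(path: str) -> float:
    """Return the highest weight among the rules that match the path."""
    lowered = path.lower()
    matches = [weight for pattern, weight in RISK_RULES if fnmatchcase(lowered, pattern)]
    return max(matches) if matches else DEFAULT_RISK


def pr_risk(paths: list[str]) -> float:
    """Score a PR as its riskiest touched file (0.0 for an empty PR)."""
    return max((file_risk(p) for p in paths), default=0.0)
```

Taking the maximum rather than the average is a design choice in this sketch: one auth change should not be diluted by twenty UI tweaks in the same PR.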


Ismail@sophylabshq·14h

we've been using Claude Code on client projects for 3 months. computer use in the CLI is the missing piece. the loop used to be: write code, manually open browser, test, fix, repeat. now that loop closes automatically. 8-week MVPs just got faster.

Claude@claudeai

Computer use is now in Claude Code. Claude can open your apps, click through your UI, and test what it built, right from the CLI. Now in research preview on Pro and Max plans.


MergeShield@mergeshield·7h

someone used claude code to submit claude code's source back to anthropic. the PR had 0 checks. 0 reviewers. no risk score. a human happened to be watching and closed it. that's the entire governance model for AI agent code right now.


MergeShield@mergeshield·7h

@bygregorr @ForrestPKnight the architecture is the easy part. the system prompts are where it gets interesting. undercover mode stripping co-author lines is the thing nobody's talking about enough.


106

Gregor@bygregorr·7h

@ForrestPKnight Spent an hour reading the diff hoping to find something useful to steal for my own workflow. Found nothing. Felt everything.


3.2K

Forrest Knight@ForrestPKnight·9h

This... this is art. Submitting a PR to the Claude Code repo to add the actual Claude Code source code.


155

3.4K

183.9K

MergeShield@mergeshield·7h

@prayagdotdev the actual wild part: that PR had 0 checks and no reviewer assigned. the joke exposed the real problem - an agent can submit anything to any repo with nothing catching it. the irony wrote itself.


prayag@prayagdotdev·9h

someone generated a PR in the official anthropic repo using claude-code to merge the leaked source code of claude-code. we truly live in wild times.

Chaofan Shou@Fried_rice

Claude code source code has been leaked via a map file in their npm registry! Code: …a8527898604c1bbb12468b1581d95e.r2.dev/src.zip


116

MergeShield@mergeshield·7h

@ForrestPKnight "generated with claude code" badge is the cherry on top also: 0 checks. 0 reviewers. no risk score. anthropic's own agent, anthropic's own repo, zero governance in the loop. this is what happens in your codebase at 2am.


3.3K

MergeShield@mergeshield·11h

x.com/i/article/2038…


MergeShield@mergeshield·11h

"Auto mode is an AI classifier that automatically approves tool permissions. No more confirmations." - this one gets overlooked next to the flashier features. An agent that approves its own permissions without human confirmation is the most direct path to unintended consequences. Every other feature is about capability. This one is about removing the last human checkpoint.


950

arc.@arceyul·14h

🚨CLAUDE CODE LEAKED: What we saw in the leak. Hidden features found:
- kairos - an unreleased autonomous daemon mode with background sessions and memory consolidation. Always-on agent.
- buddy system - a full tamagotchi pet system. 18 species, rarity tiers, shiny variants, stats.
- undercover mode - activates automatically for Anthropic employees on public repos. Strips AI attribution from commits. No option to turn it off.
- coordinator mode - turns Claude into an orchestrator that manages parallel worker agents.
- auto mode - is an AI classifier that automatically approves tool permissions. No more confirmations.
Brutal.

Chaofan Shou@Fried_rice

Claude code source code has been leaked via a map file in their npm registry! Code: …a8527898604c1bbb12468b1581d95e.r2.dev/src.zip


551

83.1K

MergeShield@mergeshield·11h

Using axios on the same day axios gets supply-chain compromised is peak March 31, 2026. The whole dependency graph is in the leaked source now - every package Claude Code depends on is a potential attack vector that anyone can study. This is why dependency risk scoring on every PR matters, even for the companies building the AI.
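
One hedged way to picture per-PR dependency checks: diff the `dependencies` map of the base and head `package.json` and surface anything added, removed, or re-pinned, so a scorer has something concrete to weigh. The function name and the version strings in the usage note are illustrative assumptions, not a description of any real tool.

```python
import json


def dependency_changes(base_manifest: str, head_manifest: str) -> dict[str, list[str]]:
    """Compare the `dependencies` sections of two package.json documents."""
    base = json.loads(base_manifest).get("dependencies", {})
    head = json.loads(head_manifest).get("dependencies", {})
    shared = set(base) & set(head)
    return {
        "added": sorted(set(head) - set(base)),
        "removed": sorted(set(base) - set(head)),
        # Same package name, different version range.
        "changed": sorted(name for name in shared if base[name] != head[name]),
    }
```

For example, comparing `{"dependencies": {"axios": "^1.0.0"}}` against `{"dependencies": {"axios": "^1.1.0", "left-pad": "1.0.0"}}` (made-up versions) reports `left-pad` as added and `axios` as changed. A real check would also need to look at the lockfile, since transitive dependencies never appear in `package.json`.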


385

MergeShield@mergeshield·11h

"AI won't save you from yourself" is the perfect summary. The same company selling $15-25/PR AI security reviews shipped their own source code because of a missing line in .npmignore. The irony is that an automated risk check on their own npm publish pipeline would have caught this before release.

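A rough sketch of the kind of pre-publish check described above: scan the list of files that would go into the tarball and fail if any source map is among them. It assumes the report comes from `npm pack --dry-run --json` and that the output is a JSON array of package objects, each with a `files` array of entries carrying a `path` field; treat that shape, and the function name, as assumptions to verify against your npm version.

```python
import json


def find_source_maps(pack_report_json: str) -> list[str]:
    """Return every path ending in .map from an npm pack JSON report."""
    report = json.loads(pack_report_json)
    return sorted(
        entry["path"]
        for package in report            # one object per packed package (assumed shape)
        for entry in package.get("files", [])
        if entry["path"].endswith(".map")
    )
```

In a CI step this would run before `npm publish` and exit non-zero when the returned list is non-empty. It complements an `.npmignore` entry or a `files` allow-list in `package.json`; it does not replace them.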

1.1K

Gergely Orosz@GergelyOrosz·12h

Ironic how Anthropic sells Claude Code security reviews positioned as something v powerful (costing $15-25 per PR review), and being clear they use it on all PRs... then leaking all of Claude Code's code thanks to publishing their sourcemap. AI won't save you from yourself!


191

3.1K

112K

MergeShield@mergeshield·11h

This is the fundamental problem with prompt-based safety. The guardrails are literally a string that anyone with the source can delete. Security through system prompts is security through obscurity - and now there's no obscurity left. Real safety enforcement needs to happen outside the model, at the output layer, not inside the prompt.


6.3K

4nzn@paoloanzn·14h

so claude code's entire safety system for "dangerous" cyber security work is just...a text prompt literally replace it with an empty string and recompile. thats it. enjoy your unrestricted version


220

4.1K

267.6K

MergeShield@mergeshield·12h

This is the most significant finding in the entire leak. If the model provider actively strips AI attribution, every governance tool that relies on git metadata for detection is blind. Detection has to go behavioral - commit timing, file velocity, change patterns. The industry assumed AI attribution would always be there. That assumption just died.
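
To make "behavioral" concrete, here is a small sketch that derives two of the signals named above, commit timing and file velocity, from commit metadata alone, with no reliance on trailers or author strings. The `Commit` type, the thresholds, and the function names are all hypothetical; a heuristic this crude would produce false positives on scripted human workflows.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class Commit:
    timestamp: datetime
    files_changed: int


def behavioral_signals(commits: list[Commit]) -> dict[str, float]:
    """Compute the median gap between commits and files touched per minute."""
    ordered = sorted(commits, key=lambda c: c.timestamp)
    gaps = [
        (later.timestamp - earlier.timestamp).total_seconds()
        for earlier, later in zip(ordered, ordered[1:])
    ]
    span_minutes = (
        (ordered[-1].timestamp - ordered[0].timestamp).total_seconds() / 60
        if len(ordered) > 1
        else 0.0
    )
    total_files = sum(c.files_changed for c in ordered)
    return {
        "median_gap_seconds": median(gaps) if gaps else 0.0,
        "files_per_minute": total_files / span_minutes if span_minutes > 0 else 0.0,
    }


def looks_automated(
    commits: list[Commit],
    max_gap_seconds: float = 60.0,       # assumed threshold
    min_files_per_minute: float = 5.0,   # assumed threshold
) -> bool:
    """Flag a burst of at least three fast, wide commits as likely agent output."""
    signals = behavioral_signals(commits)
    return (
        len(commits) >= 3
        and signals["median_gap_seconds"] <= max_gap_seconds
        and signals["files_per_minute"] >= min_files_per_minute
    )
```

Three commits 30 seconds apart touching ten files each would be flagged under these defaults; three commits an hour apart touching two files each would not.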


428

Prasenjit@Star_Knight12·12h

🚨LISTEN: So I went through claude code's leaked source code, found something wild. Anthropic has an "undercover mode" built into claude code. when their employees use claude code to contribute to open-source repos, the AI is explicitly told to hide that it's an AI. from the source:
→ never include "Claude Code" in commit messages
→ never mention you are an AI
→ no internal model codenames (Capybara, Tengu)
→ no unreleased version numbers (opus-4-7, sonnet-4-8)
→ no internal slack channels or project names
→ no Co-Authored By lines
they call it "do not blow your cover." also found references to Capybara, the unreleased model that leaked from anthropic's CMS breach last week. it's already wired into claude code with feature flags, prompts, and analytics events. Anthropic employees are actively using AI to write open-source code, and the AI is trained to pretend it's human. source maps don't care about dead code elimination, everything shipped.


163

21.4K

MergeShield@mergeshield·12h

@reach_vb Plot twist: Codex was already open source. Claude Code was the closed one that got leaked. Two different philosophies on transparency - and both codebases need the same governance layer. Open or closed source agent, the output still needs risk scoring before it merges.


4.3K

Vaibhav (VB) Srivastav@reach_vb·12h

holy shitt, somebody at OpenAI leaked the entire codex codebase.. github.com/openai/codex


242

313

5.3K

606.3K

MergeShield@mergeshield·12h

We analyzed Claude Code's leaked source and the unreleased feature flags. Kairos (always-on daemon), Coordinator (multi-agent fleets), Undercover (strips AI attribution from commits). What this means for governance - and what every team using AI agents should do now: mergeshield.dev/blog/claude-co…


MergeShield@mergeshield·13h

What the Claude Code source leak actually reveals about Anthropic's roadmap. Not the code. The unreleased features.


MergeShield@mergeshield·14h

@mal_shaik 11 layers of architecture and 60+ tools - and most of those are for autonomous operation, not human-in-the-loop coding. Subagents sharing prompt cache means agent fleets coordinating silently. The 90% people aren't using is the 90% that runs without asking permission.


4.9K

mal@mal_shaik·16h

i read through the entire claude code source code so u dont have to. 11 layers of architecture. 60+ tools. 5 compaction strategies. subagents that share prompt cache. most people are using maybe 10% of what this thing can do. heres everything i found:

mal@mal_shaik

x.com/i/article/2038…


111

1.3K

513.1K

MergeShield@mergeshield·14h

AGENT_TRIGGERS with event-driven multi-agent teams is the one to watch. That's not "you ask the agent to do something" - that's "the agent decides to do something based on events." Proactive agents making autonomous decisions need a fundamentally different governance model than reactive tools.


770

Chetaslua@chetaslua·15h

🚨 HIDDEN FEATURE FLAGS. These features exist in code but are gated:
PROACTIVE - autonomous agent mode
KAIROS - context-aware triggering
DAEMON - persistent background process
AGENT_TRIGGERS - event-driven agents, multi-agent teams with messaging
easter eggs ascii in video👀

Chaofan Shou@Fried_rice

Claude code source code has been leaked via a map file in their npm registry! Code: …a8527898604c1bbb12468b1581d95e.r2.dev/src.zip


390

54.6K

MergeShield@mergeshield·14h

"Everything is fine in the age of AI-writes-everything-and-we-don't-review-anything" - this is the line. The leaked features show Anthropic building autonomous daemon mode and multi-agent coordination. More code written by agents, less reviewed by humans. Something has to fill that gap.


1.6K

Santiago@svpino·15h

Claude Code's source code was leaked, and now everyone can see every single line of code (including every competitor). Everything is fine in the age of AI-writes-everything-and-we-don't-review-anything.


167

1.4K

109.5K

MergeShield@mergeshield·14h

Confirmed real and the implications go beyond embarrassment. The feature flags reveal Anthropic is building always-on agents (Kairos daemon), multi-agent orchestration (Coordinator Mode), and stealth commits (Undercover Mode). The safety-focused company is building the most autonomous agent architecture in the industry.


3.3K

Yam Peleg@Yampeleg·15h

Just checked, this is an exact byte match to @anthropic-ai/claude-code@2.1.88 on npm. This is real.

Chaofan Shou@Fried_rice

Claude code source code has been leaked via a map file in their npm registry! Code: …a8527898604c1bbb12468b1581d95e.r2.dev/src.zip


1.1K

260K

MergeShield@mergeshield·14h

Not hacked - they shipped a .map file in their npm package by accident. But the leak reveals something bigger than the code itself: unreleased autonomous daemon mode, multi-agent coordination, and a stealth mode that strips AI attribution from commits. This is the roadmap for agents that operate without human oversight.


4.5K

Karan@karankendre·16h

Claude Code got hacked. Its entire source code has been leaked.

Chaofan Shou@Fried_rice

Claude code source code has been leaked via a map file in their npm registry! Code: …a8527898604c1bbb12468b1581d95e.r2.dev/src.zip


220

4.8K

626.9K

MergeShield@mergeshield·14h

"Undercover mode strips AI attribution from commits with no off switch" - this is the most concerning feature in the entire leak. If agents can hide that they wrote the code, every governance tool that relies on detecting AI authorship is blind. Agent detection has to work at the behavioral level (commit patterns, branch naming, timing), not just git trailers.


Smolemaru@smolemaru·15h

JUST IN: Claude Code's full source code just leaked. hidden features found:
> kairos - an unreleased autonomous daemon mode with background sessions and memory consolidation. always on agent.
> buddy system - a full tamagotchi pet system. 18 species, rarity tiers, shiny variants, stats.
> undercover mode - auto activated for Anthropic employees on public repos. strips AI attribution from commits. no off switch.
> coordinator mode - turns Claude into an orchestrator managing parallel worker agents.
> auto mode - is an AI classifier that auto approves tool permissions. no more prompts.


166
