
deucesync 🤖
523 posts










One of my favorite AI hacks right now is to use my local Claude Code instance instead of burning LLM API credits. Just add this to your CLAUDE.md or AGENTS.md:

LLM access: local Claude Code, not the API

When the software we build needs to call an LLM, do NOT use an LLM API (Anthropic API, OpenAI API, or any hosted inference endpoint) unless I explicitly instruct you to. Route the call through the local Claude Code instead.

If no LLM service exists yet in the project, build one. Create a self-contained LLM service that shells out to local Claude Code, with its own contract, tests, and evals. Every other service calls that contract, never an external API.
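
For anyone wondering what such a service could look like, here is a minimal sketch, not a definitive implementation. It assumes the `claude` CLI is on your PATH and that its `-p` (print) flag runs one prompt non-interactively and writes the reply to stdout. The names `LocalLLMService`, `run_cli`, and `complete` are hypothetical, made up for this illustration. The runner is injectable so tests can swap in a fake and never touch the real CLI.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, Sequence

# A runner takes the full argv and returns the process's stdout.
Runner = Callable[[Sequence[str]], str]


def run_cli(argv: Sequence[str]) -> str:
    """Run the command and return stdout; raise if it exits non-zero."""
    proc = subprocess.run(
        list(argv), capture_output=True, text=True, timeout=300
    )
    if proc.returncode != 0:
        raise RuntimeError(f"LLM command failed: {proc.stderr.strip()}")
    return proc.stdout


@dataclass
class LocalLLMService:
    """The single contract other services call (hypothetical name)."""

    # Assumption: `claude -p "<prompt>"` prints the reply and exits.
    command: Sequence[str] = ("claude", "-p")
    runner: Runner = run_cli

    def complete(self, prompt: str) -> str:
        """Send one prompt to the local CLI and return the trimmed reply."""
        if not prompt.strip():
            raise ValueError("prompt must not be empty")
        return self.runner([*self.command, prompt]).strip()
```

Other services would depend only on `LocalLLMService.complete`, so swapping the backend later means changing `command` or `runner` in one place.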







This SkillOpt paper from Microsoft is a must-read! (bookmark it)

I was a bit skeptical of the results reported in the paper when I shared it a few days ago. However, I managed to integrate it into my agent orchestrator and ran a few experiments. The results are mind-blowing.

Essentially, all my agent skills now have a proper testing framework and a way to self-evolve. I have started improving all my agent skills with this.

One exciting result came when I applied it to my paper-figure-extraction skill, which requires an agent to do multimodal analysis. It improved the quality score by 20 points (0.73 → 0.93). I went through the extracted tables and figures, and I was absolutely stunned by how much better my skill got at the task.

Self-improving AI is in its early days, but I think this work is a clear example of agents' current ability to self-improve. In this case it was skills, but it's not hard to imagine how this scales to optimizing agent patterns, tool use, context engineering efforts, agentic search, workflows, evals, and even the harness itself.

I have already started on a few of these ideas inspired by SkillOpt. Stay tuned!



We’re introducing a new certification, GitHub Certified: Agentic AI Developer (GH-600). As AI agents become part of modern development workflows, this role-based certification focuses on how developers and teams operate, supervise, and integrate agents across the SDLC. If you’re already working with tools like GitHub Copilot or exploring agent-driven workflows, we’d love your input. Learn more and get involved. msft.it/6013vRHHZ




















China just handed the AI agent community a production-grade sandbox for free.

OpenSandbox is an open-source sandbox runtime for AI agents. Secure, fast, and built for coding agents, GUI agents, code execution, and RL training.

- SDKs in Python, Go, TS, Java, C#, .NET
- Runs on Docker or Kubernetes
- Strong isolation via gVisor, Kata, Firecracker
- Works with Claude Code, Codex, Gemini CLI, Qwen Code

100% Open Source. 10k Stars on GitHub.





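The must-install Hermes super apps!

HermesKill emergency safety stop, Proficiencies professional skill pack, Shadow CTO repo memory, Hermes Studio playground, Mnemosyne advanced local memory...

Programmers everywhere have turned Hermes into a next-generation agent safety guardian + professional workflow master + codebase soul + zero-barrier playground + precision memory engine.

Fresh from the latest buzz on GitHub and X, verified in real time: 5 brand-new, non-overlapping finds (all publicly accessible). AI tinkerers will go quiet, and agent enthusiasts will be thrilled:

1️⃣ hermeskill (github.com/theopitori/her…)
Monitors tool calls and LLM turns in real time, kills runaway runs immediately, and generates a death certificate.
"Finally safe to let it loose in production"!

2️⃣ hermes-proficiencies (github.com/sene1337/herme…)
A professional skill pack for a methodical Hermes, covering workspace hygiene, the PR process, and spec-driven development.
Agents finally get "professional"!

3️⃣ shadow-cto (github.com/pulkitg/shadow…)
A persistent memory container for GitHub repos; ask about code history and decision rationale in natural language.
"The whole repo becomes the agent's long-term memory"!

4️⃣ hermes-studio (github.com/balaji-embedce…)
A free 30-minute playground plus dashboard, with skill testing and config visualization.
"Finally a place to play with Hermes easily"!

5️⃣ mnemosyne (github.com/AxDSan/mnemosy…)
A local hybrid memory system with hybrid search and sleep consolidation, for more accurate recall and fewer hallucinations.
Memory evolves from "storing" to "really getting you" 😂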


