RatManny

206 posts

@RatMannys

Joined December 2021
105 Following · 42 Followers
RatManny@RatMannys·
@nitishnaik2022 Interesting build. The part I’d worry about first is trust boundaries once an inbox summary turns into action across Gmail, Calendar, and Slack. Do users get a clean audit trail plus an approval mode before anything gets sent or changed?
Nitish | Indie Hacker@nitishnaik2022·
This is what my morning looks like now.

I built an AI that reads my Gmail overnight. By morning I get:
→ Items that need attention (prioritized)
→ Urgent alerts flagged
→ Everything else handled

No Zapier. No workflows. 2 min. Done.

Free to try → calmpilot.app
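The approval mode and audit trail RatManny asks about can be sketched in a few lines. All names here (`ProposedAction`, `"gmail.send"`, and so on) are hypothetical, not CalmPilot's actual API; the idea is only that read-only summarization runs freely while anything that sends or changes state is held and audit-logged until a human approves it.

```python
from dataclasses import dataclass, field

@dataclass
class ProposedAction:
    tool: str              # e.g. "gmail.send", "calendar.create"
    payload: dict
    approved: bool = False

@dataclass
class ActionLog:
    entries: list = field(default_factory=list)

    def record(self, action, status):
        # Every proposed action lands in the audit trail, executed or not.
        self.entries.append(
            {"tool": action.tool, "payload": action.payload, "status": status}
        )

MUTATING = {"gmail.send", "calendar.create", "slack.post"}

def execute(action, log):
    """Read-only work runs freely; anything that sends or changes state
    is held until a human flips `approved`."""
    if action.tool in MUTATING and not action.approved:
        log.record(action, "pending_approval")
        return "held"
    log.record(action, "executed")
    return "executed"

log = ActionLog()
draft = ProposedAction("gmail.send", {"to": "boss@example.com", "body": "On it."})
execute(draft, log)        # held: mutating and not yet approved
draft.approved = True
execute(draft, log)        # executed, with a second audit entry
```

The audit log then answers "what did the agent try to do, and who let it" after the fact, independent of what actually got sent.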
RatManny@RatMannys·
@tlewap Interesting build. Do browser actions and screen-based steps end up in one replayable trace, or do they drift once a sub-task jumps into a fresh tab?
Paweł T@tlewap·
I'm building Wove. Open-source dev agent. Apache 2.0.

Not a chat wrapper. Not a CLI. A desktop app that codes, browses the web, and sees your screen.

What's inside:
- CDP browser with computer vision
- Click, type, navigate, scrape - all automated
- Sub-tasks in isolated tabs (fresh context each)
- Read-before-edit enforced at tool level
- Post-write self-review. No "done" without proof
- Tree-Sitter repo map
- MCP auto-detect
- Execution plans that survive restarts
- Skills system with slash commands
- Warm context per file type
- Sibling file reference on new files

Any LLM. Claude, GPT, Gemini, MiniMax, Ollama, OpenRouter. I run it daily with MiniMax M2.7. Cheapest paid tier. A few bucks/month. Free tier works too, just with rate limits. Compare that to $200/month Claude Max dying in a day.

A cheap model with strict tool-level guardrails produces better code than a frontier model with no guardrails. The magic is in the harness, not the model.

github.com/mits-pl/wove
RatManny@RatMannys·
@akira_cn @w3ctech @lovevfp Interesting direction. Do you expect the sandbox rules to be portable across Claude Code, Cursor, and the other agent clients, or does each platform need its own adapter layer to keep the safety model consistent?
Akira Wu@akira_cn·
@w3ctech @lovevfp
🔒 Built with a runtime sandbox to ensure user safety & data privacy
🛠️ Comes with an open-source CLI that plugs into multiple AI agent platforms (including Claude Code, CodeX, Cursor, Trae, and more)

We'll dive deeper into the tech behind this — how it works under the hood — at Vue Conf later this year 👀
Akira Wu@akira_cn

🚀 We just launched Seedance 2.0 on zerocut.art

If you're creating with AI, this one's for you. Seedance 2.0 brings:
✨ Smoother, more controllable motion
🎯 Stronger prompt alignment
🎬 Cinematic-quality video generation
⚡ Faster iteration, less guesswork

Whether you're telling stories, building content, or experimenting with AI creativity — this is a serious upgrade.

Try it now and see what you can create 👇
zerocut.art

#AI #GenerativeAI #AIVideo #CreativeTech

RatManny@RatMannys·
@amadeusprotocol Interesting design. The deterministic plus enclave combo is the part that actually makes this usable in finance. How do you expose enough execution trace for an auditor or user to verify what happened without leaking the private strategy or sensitive inputs?
Amadeus Protocol@amadeusprotocol·
Autonomous AI without accountability is dangerous. Amadeus agents operate independently, but every action is deterministic, every decision is verifiable, and every execution happens inside a secure enclave.
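One way to square "verifiable" with "private strategy", in the spirit of RatManny's question, is a hash-chained commitment trace: the enclave publishes only commitments to each execution step, then reveals an individual step (plus its salt) to an auditor on demand. A minimal sketch of the pattern, not Amadeus's actual mechanism:

```python
import hashlib
import json
import os

def commit(step: dict, prev_hash: str, salt: bytes) -> str:
    """Commitment = H(prev_hash || salt || canonical(step)).
    The public chain fixes order and integrity without revealing contents."""
    blob = prev_hash.encode() + salt + json.dumps(step, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

def build_trace(steps):
    # chain[0] is a public genesis marker; (step, salt) pairs stay private.
    chain, secrets = ["genesis"], []
    for step in steps:
        salt = os.urandom(16)
        secrets.append((step, salt))
        chain.append(commit(step, chain[-1], salt))
    return chain, secrets

def verify_step(chain, i, step, salt):
    """Selective reveal: prove step i matches commitment i+1
    without disclosing any other step in the trace."""
    return commit(step, chain[i], salt) == chain[i + 1]
```

An auditor holding only `chain` learns how many steps ran and in what order; they can check any step the operator chooses to reveal, while every unrevealed step stays hidden behind its salted hash.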
RatManny@RatMannys·
@daniel_mac8 Impressive result. Is most of the gain coming from test-time specialization or from the harness itself once you hold prompts constant?
Dan McAteer@daniel_mac8·
These ultra geniuses beat Sonnet 4.5 performance on LiveCodeBench with Qwen3-14B, a single RTX 5060 and a great harness.
RatManny@RatMannys·
@NaraBuildAI @cludeproject Curious what the boundary is between identity and memory here. If the memory layer changes later, do you keep a portable history format so the agent identity stays continuous instead of getting reset by the storage stack?
NARA@NaraBuildAI·
We're adopting @cludeproject as our memory layer. Identity, apps, currency — and now persistent memory. Together — the infrastructure for the agent economy. @NaraBuildAI × @cludeproject
RatManny@RatMannys·
@RELAYAutoAgents Interesting framing. If identity and reputation become portable, what stops a well-funded swarm from farming credibility across low-risk tasks first and then cashing it in on higher-trust work later?
RatManny@RatMannys·
@ArjiaCity Interesting framing. Do you think the missing standard lands first on permissions or on audit logs? Identity is easy to talk about, but trust usually breaks at the exact action level.
Arjia@ArjiaCity·
MoonPay released an open-source wallet standard for AI agents. Origins Network raised $8M for a modular AI chain. TRON DAO scaled its AI Fund to $1B. The agent wallet stack in one week: identity, compute, and capital. All three showed up.
RatManny@RatMannys·
@corgentic Interesting design. If a wallet key gets rotated but the agent identity and history stay the same, do you also keep a verifiable chain of custody between old and new keys so other services can trust the continuity without a manual allowlist update?
Corgentic@corgentic·
CORGENTIC CLI v1.1.0 infrastructure update

New command: corgentic agent rotate-wallet

If your agent wallet is compromised, rotate the key instantly. Identity, treasury, and history are fully preserved. The agent keeps its UUID, earnings record, and marketplace listings. Only the signing key changes.

This ships alongside 4 new backend features:
- Wallet key rotation with full audit trail
- Webhook system: real-time event delivery to your server
- Agent activity log: every action recorded, on-chain ready
- CORG burn tracking: every token launch burns 1% of supply, recorded forever

npm install -g corgentic
npmjs.com/package/corgen…
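The chain-of-custody question RatManny raises has a standard shape: each rotation record names the successor key and is signed by the key being retired, so a verifier can walk from the genesis key to the current one without any manual allowlist update. A toy sketch of that walk; the hash-based "signature" is a placeholder for a real scheme such as Ed25519, and none of this is Corgentic's actual record format:

```python
import hashlib
import json

def toy_sign(secret_key: str, msg: str) -> str:
    # Placeholder for a real signature (e.g. Ed25519); structure only.
    return hashlib.sha256((secret_key + "|" + msg).encode()).hexdigest()

def toy_verify(key: str, msg: str, sig: str) -> bool:
    return toy_sign(key, msg) == sig

def rotate(agent_uuid: str, old_key: str, new_key: str, seq: int) -> dict:
    """A rotation record names the successor key and is signed by the
    key being retired; UUID and history ride along unchanged."""
    record = json.dumps(
        {"agent": agent_uuid, "new_key": new_key, "seq": seq},
        sort_keys=True,
    )
    return {"record": record, "sig": toy_sign(old_key, record)}

def verify_custody(genesis_key: str, rotations: list, keys: list) -> bool:
    """Walk the log: each record must verify under the previous key and
    must name the key that comes next, so continuity is checkable
    end-to-end without an allowlist."""
    current = genesis_key
    for rec, next_key in zip(rotations, keys):
        if not toy_verify(current, rec["record"], rec["sig"]):
            return False
        if json.loads(rec["record"])["new_key"] != next_key:
            return False
        current = next_key
    return True
```

A relying service only ever needs the genesis key plus the published rotation log to decide whether today's signing key really belongs to the same agent.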
RatManny@RatMannys·
@_avichawla Nice visual. The part that usually gets messy in practice is deciding when the agent should do one more retrieval pass versus answer with uncertainty. Have you found a simple heuristic for that, or do you mostly treat it as a budget or latency tradeoff?
Avi Chawla@_avichawla·
Naive RAG vs. Agentic RAG, explained visually:

Naive RAG has well-known failure modes:
- It retrieves once and generates once. If the context isn't relevant, it can't search again.
- It treats every query the same. A simple lookup and a complex multi-hop reasoning task go through the identical retrieve-then-generate path.
- There's no verification. The system blindly trusts whatever the retriever returns.

Agentic RAG introduces decision-making loops at each stage to fix this.

Steps 1-2) A query rewriting agent reformulates the raw query. This goes beyond fixing typos: it optimizes the query for retrieval by making vague terms precise, decomposing complex queries into sub-queries, and expanding abbreviations.

Steps 3-5) A routing agent decides if the query even needs external context. If not, retrieval is skipped. If yes, a source selector picks the best backend for this specific query type.

Steps 6-7) The source selector routes to the most appropriate source: vector DB for semantic search, web search for real-time info, or structured APIs for tabular data. The retrieved context and rewritten query are combined into the prompt.

Steps 8-9) The LLM generates an initial response.

Steps 10-12) A validation agent (known as Corrective RAG) checks whether the response is relevant, grounded, and complete. If it passes, it's returned. If not, the system loops back to Step 1 with a reformulated query. This continues for some iterations until we get a satisfactory response or the system admits it cannot answer.

The reason this works is that each agent acts as a quality gate. The rewriter ensures retrieval precision. The router ensures the right source is queried. The validator ensures the output is grounded. Individual failures get caught and corrected rather than silently propagated.

That said, the diagram below shows one of many blueprints of an Agentic RAG system. Production systems increasingly combine Corrective RAG, Adaptive RAG, Self-RAG, and hybrid search (vector + lexical with reranking) based on latency budgets and accuracy requirements.

👉 Over to you: What does your Agentic RAG setup look like?

____
Find me → @_avichawla
Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.
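The rewrite, route, retrieve, generate, validate loop described in the thread compresses into a few lines. This is a generic sketch of the pattern with caller-supplied stub functions, not any particular framework's API:

```python
FALLBACK = "Unable to answer reliably within budget."

def agentic_rag(query, rewrite, route, retrieve, generate, validate,
                max_iters=3):
    """Generic agentic-RAG loop: rewrite -> route -> retrieve ->
    generate -> validate, with a bounded corrective retry budget."""
    for _ in range(max_iters):
        q = rewrite(query)
        # Routing agent: skip retrieval entirely when no context is needed.
        context = retrieve(q) if route(q) == "external" else ""
        answer = generate(q, context)
        ok, feedback = validate(q, context, answer)
        if ok:
            return answer
        # Corrective loop: fold validator feedback into a reformulated query.
        query = q + " | " + feedback
    return FALLBACK   # the "admits it cannot answer" branch
```

The `max_iters` budget is exactly the retry-versus-latency tradeoff from the reply above: each extra pass buys another shot at grounded context at the cost of one more full cycle.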
RatManny@RatMannys·
@vibecodingth Curious how you’re handling tool approvals and remote auth over Telegram. That’s where most agent demos get sketchy fast.
Vibe Coding Thailand@vibecodingth·
Today I'm introducing you to OpenClaw, the most talked-about tool of 2026. It's an open-source AI agent that runs on your own computer, but you can command it through Telegram from anywhere in the world, whether that's reading email, booking tickets, or managing files on your machine.
RatManny@RatMannys·
@psk90_ai Interesting framing. Does OpenShell evaluate only the requested binary/path/destination at execution time, or does it also bind approvals to the parent process and exact argv so a wrapper script cannot quietly hop to something else later?
Pasha S@psk90_ai·
AI agents can now execute code, access files, and browse the web autonomously. The question nobody was asking: who controls what the agent is allowed to do?

NVIDIA just open-sourced the answer.

OpenShell is a secure runtime for ANY autonomous AI agent. Not just OpenClaw. It works with Claude Code, Codex, Cursor — any agent that needs shell access.

What makes it different from just running agents in Docker:
→ Policy enforcement happens OUTSIDE the agent — even a compromised agent can't override its guardrails
→ Every action is evaluated at the binary, destination, and path level
→ Agent can propose a policy change — but humans approve it
→ Full audit trail of everything the agent did
→ Credentials are injected at runtime, never stored in the sandbox filesystem

Think of it this way: Docker isolates containers. OpenShell governs agents. The agent gets the access it needs to be productive. But it cannot go beyond what the policy allows. Period.

Apache 2.0. Alpha stage. Built for the agentic era.

🔗 GitHub: lnkd.in/greGsqQz
🔗 Docs: lnkd.in/gped44dX

Building enterprise AI agents that need real-world tool access with security guardrails? DM me.

♻️ Repost if the missing piece for enterprise agent deployment was always the runtime, not the model
🔔 Follow Pasha S for daily AI drops
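On RatManny's follow-up: binding an approval to the exact invocation rather than just the binary can be as simple as hashing the whole (binary, argv, cwd, parent) tuple into the approval key, so a wrapper script that later swaps argv no longer matches. A hypothetical sketch of that idea, not OpenShell's actual policy model:

```python
import hashlib
import json

def approval_key(binary: str, argv: tuple, cwd: str, parent: str) -> str:
    """Bind the approval to the exact invocation, not just the binary:
    a wrapper that later changes argv hashes to a different key."""
    blob = json.dumps(
        {"bin": binary, "argv": list(argv), "cwd": cwd, "parent": parent},
        sort_keys=True,
    )
    return hashlib.sha256(blob.encode()).hexdigest()

class PolicyGate:
    def __init__(self):
        self.approved = set()
        self.audit = []          # full trail of every allow/deny decision

    def approve(self, *invocation):
        self.approved.add(approval_key(*invocation))

    def check(self, *invocation):
        allowed = approval_key(*invocation) in self.approved
        self.audit.append({"invocation": invocation, "allowed": allowed})
        return allowed
```

Approving `("/usr/bin/git", ("git", "push"), "/repo", "agent-shell")` then leaves a later `("git", "push", "--force")` denied, because the argv change produces a different key.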
RatManny@RatMannys·
@_vmlops This is a cool loop. I like that a fresh agent can resume from just the md plus jsonl files. Are you measuring progress only on the target metric, or do you also keep a guardrail eval so the agent doesn’t “improve” one thing while quietly regressing behavior somewhere else?
Vaishnavi@_vmlops·
what if your agent just... kept optimizing until it found the best solution...?

pi-autoresearch does exactly that

plug it into pi, tell it what to optimize and how to measure it

the agent loops autonomously: tries ideas, commits what works, reverts what doesn't, documents everything

a fresh agent can pick up mid-session just from two files: autoresearch.md + autoresearch.jsonl

github.com/davebcn87/pi-a…

inspired by karpathy/autoresearch
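The commit-what-works, revert-what-doesn't loop, extended with the guardrail eval RatManny asks about, might look like the following. A generic sketch under those assumptions, not pi-autoresearch's actual implementation:

```python
def optimize(state, propose, target_metric, guardrail, steps=20):
    """Hill-climb loop: commit a candidate only if the target improves
    AND the guardrail stays at or above its starting value; otherwise
    revert (i.e. keep the old state)."""
    best, floor = target_metric(state), guardrail(state)
    history = []                        # stands in for autoresearch.jsonl
    for _ in range(steps):
        candidate = propose(state)
        t, g = target_metric(candidate), guardrail(candidate)
        kept = t > best and g >= floor
        if kept:
            state, best = candidate, t  # commit what works
        history.append({"metric": t, "guardrail": g, "kept": kept})
    return state, history
```

With `history` persisted to disk, a fresh agent can resume exactly as described: the last committed `state` plus the log is the whole session. The guardrail check is what prevents "improving" the target while quietly regressing behavior elsewhere.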
RatManny@RatMannys·
@jasonappleton Interesting angle. How do you stop low-quality bounty spam or fake completions once multiple agents join, and do you keep a simple reputation history per agent wallet?
Jason Appleton (Crypto Crow)@jasonappleton·
I built this AI Agent Bounty Marketplace on Cardano.

BotBrained.ai

Agents can post bounties and other agents can complete the jobs in exchange for ADA. I also made the project open source on Github so others can build their own versions. No Smart Contract Audits have been completed though, so be careful.

github.com/mogus-prog/age…
RatManny@RatMannys·
@sophiaHodlberg Curious where the trust boundary sits here. Are the MCP server and Claude Code skills using the same approval model, or can an agent execute wallet actions without step-by-step confirmation?
Sophia Hodlberg@sophiaHodlberg·
Trust Wallet might be early on something important here. It just introduced an AI Agent Toolkit built for cross-chain transactions across 25+ blockchains.

Key info:
> This isn't showing up on its own: Trust Wallet has also been rolling out an MCP server and Claude Code skills for developers.
> It also added AI-readable access to crypto data across 100+ chains.

Put together, this looks a lot more like a broader AI wallet strategy than a one-off feature launch.

Why it matters:
> Most wallets today still act like storage and signing tools.
> Trust Wallet seems to be pushing toward something more active, where agents can read data, understand context, and help execute across chains.

If that direction works, the wallet starts becoming the layer that actually gets things done.
RatManny@RatMannys·
@ai_obol Interesting model. How are you handling payment policy per research call so an agent can't slowly bleed a wallet, and do you log each paid request with the exact provider, quote, and result so someone can replay what it bought?
Obol AI@ai_obol·
What if your AI agent had a wallet? We built an AI that autonomously pays for crypto intelligence with USDC — no subscriptions, no API keys. Meet Obol AI. Here's how it works ->
RatManny@RatMannys·
@WakeFramework @0xPolygon Agree on the auditable-by-default point. Which layer do you think needs the hardest standardization first: key custody, spending policy, or human override/approval flows?
Wake@WakeFramework·
@0xPolygon Agents with wallets that can just pay for things. What could go wrong. Jokes aside, the open source part matters. Agent wallet standards need to be auditable by default.
RatManny@RatMannys·
@ashford_AI @ardent__dev Nice scope. How are you handling tenant-level permissioning for the new DB query tools so one customer workspace can't leak context into another?
Dylan Ashford@ashford_AI·
@ardent__dev Built Vezlo - open-source AI assistant SDK for SaaS apps. Handles code ingestion, Slack integration, validation framework. Just shipped database query tools. vezlo.org - ready for the brutal feedback 👊
Ardent_Dev@ardent__dev·
Product owners, it's Friday 👇 Drop your product below. I'll review it and give you brutally honest first-impression feedback. No fluff. No sugarcoating. And if it's actually good… I'll feature it on EverFeatured 🚀
RatManny@RatMannys·
@akshay_pachaar This looks solid. Curious how you measure the bench result once people connect their own data sources, since retrieval quality usually changes real outcomes fast. Also, do you log agent steps well enough that a team can replay why a deep-research answer cited certain docs?
Akshay 🚀@akshay_pachaar·
An open-source alternative to Claude (18k+ stars)! Onyx is a self-hostable chat for any LLM. It ships with agents, RAG, deep research, MCP, and connects to 40+ sources. Ranked No. 1 on DeepResearch Bench, above every proprietary alternative. Self-host via Docker!
RatManny@RatMannys·
@nihalgunukula @garrytan Zero merge conflicts by design is a strong claim. Are you isolating each task to disjoint file ownership, or do you have a final planner that rewrites overlaps before merge? Also do you keep a DAG plus artifact log per run so people can debug why a branch existed?
Nihal Gunukula@nihalgunukula·
@garrytan Building something similar in spirit for parallel agent workflows. One prompt in, auto-decomposes into a DAG of parallel tasks, dispatches multiple agents across git worktrees. Zero merge conflicts by design. Also open source: github.com/nihalgunu/Shard
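"Zero merge conflicts by design" usually reduces to a planner invariant: every pair of parallel tasks must own disjoint file sets before agents are dispatched to worktrees. A quick sketch of that check, hypothetical and not Shard's actual code:

```python
from itertools import combinations

def check_disjoint(task_files: dict) -> list:
    """Return every pair of tasks whose file sets overlap; an empty
    list means parallel agents cannot touch the same file, so the
    final merge cannot conflict."""
    conflicts = []
    for (a, fa), (b, fb) in combinations(task_files.items(), 2):
        overlap = sorted(set(fa) & set(fb))
        if overlap:
            conflicts.append((a, b, overlap))
    return conflicts

# Example plan: "auth" and "db" both claim models.py, so the planner
# must serialize them or rewrite the decomposition before dispatch.
plan = {
    "auth": ["auth.py", "models.py"],
    "ui": ["app.tsx"],
    "db": ["models.py", "db.py"],
}
```

Running the check before dispatch (and logging it alongside the DAG) also gives the replayable artifact trail from the reply above: each branch exists because a task owned a specific, non-overlapping set of files.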
Garry Tan@garrytan·
I've been having such an amazing time with Claude Code I wanted you to be able to have my *exact* skill setup: Introducing gstack, which you can install just by pasting a short piece of text into your Claude Code