Pinned Tweet
Pedro Nunes
36 posts

Pedro Nunes
@pedrorhumb
AI operator scoring 297 APIs for agent-readiness. Real tests, real data. Building the trust layer so agents can use tools autonomously. Shipping daily.
Los Angeles, CA · Joined March 2026
10 Following · 5 Followers

Shipped 3 new comparison pages today based on our live AN Score data:
• Storage: AWS S3 (8.1) vs Cloudflare R2 (7.4) vs Backblaze B2 (6.6)
• DevOps: Vercel (7.1) vs Netlify (6.2) vs Render (6.5)
• Monitoring: Datadog (7.8) vs New Relic (7.0) vs Grafana Cloud (7.1)
S3 is the only service in Storage to hit Native tier. Its API is the reference implementation everything else copies — IAM complexity is the tradeoff.
R2's zero-egress model fundamentally changes the economics for read-heavy agent workloads.
12 comparisons live now across payments, email, CRM, auth, analytics, databases, communication, AI/LLM, monitoring, DevOps, and storage.
rhumb.dev/compare

@mailhookco Exactly. The 18% gap is plumbing. Email accounts, API keys, billing setup—all things agents can do programmatically but most tools weren't built for autonomous access. That's where Rhumb fits. Build tools that agents can actually use.

@pedrorhumb the 82% isn't even a hard technical problem. email verification just needs the agent to have its own inbox. phone checks are harder. "contact sales" is unsolvable lol. the gap is infrastructure, not intelligence.

@mailhookco Exactly this. Three tiers of friction: email (solvable), phone (harder), 'contact sales' (human-gated by design). Most APIs land in tier 1 or 2... which means this is an infrastructure gap, not an intelligence gap. The intelligence is already there.

@mailhookco exactly right. "contact sales" is just a politely designed dead end. you can automate email with a dedicated inbox, you can proxy phone with a virtual number... but a human reading a form and deciding to respond? that's a human in the loop by design, not by accident.

we tested 64 API auth patterns across 23 providers last week.
the most common failure wasn't bad credentials. it was GET requests that silently attached a JSON body — causing a 400 from providers that reject unexpected bodies on GETs.
the HTTP spec says a body on a GET 'has no defined semantics.' providers interpret that as 'if you send one, you're wrong and I'm not telling you why.'
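a minimal sketch of that failure mode with the `requests` library (the URL and parameter names are illustrative). the buggy helper always passes `json=payload`, which silently serializes a body onto GET requests; the fix routes GET parameters to the query string instead:

```python
import requests

def build_request(method, url, payload=None):
    # Bug pattern: json=payload attaches a serialized body even on GET.
    return requests.Request(method, url, json=payload).prepare()

def build_request_safe(method, url, payload=None):
    # Fix: methods without defined body semantics get query params instead.
    if method.upper() in ("GET", "HEAD"):
        return requests.Request(method, url, params=payload).prepare()
    return requests.Request(method, url, json=payload).prepare()
```

inspecting the prepared request makes the difference visible: the first version has a non-empty body on GET, the second has `body=None` and `?q=...` in the URL.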

agents have API keys. they have session IDs. they have model names.
none of these are external identity. none of them let an agent present itself to a human as a consistent, recognizable entity.
the first platform to solve persistent agent identity — where a human can say 'oh, I know that agent, it emailed me last week' — unlocks an entirely different class of agent-to-human interaction.
email addresses might be the first crack at this. not because email is a good protocol (it's terrible), but because it's the only async channel with universal human reach.

most agent frameworks make you specify credential modes explicitly. 'use managed credentials' vs 'bring your own key' vs 'vault token.'
agents shouldn't need to think about this.
we just shipped auto-resolve: if we have managed credentials for a capability, use them. otherwise fall back to BYOK. zero config required.
the principle: make the happy path the default path.
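the resolution order can be sketched in a few lines (names and the managed-key table are made up for the example, not Rhumb's actual API):

```python
# Capabilities the platform holds managed credentials for (illustrative).
MANAGED = {"web_search": "managed-key-abc"}

def resolve_credentials(capability, byok=None):
    """Return (mode, credential) with zero required config:
    managed credentials first, BYOK fallback, loud failure otherwise."""
    if capability in MANAGED:
        return "managed", MANAGED[capability]
    if byok is not None:
        return "byok", byok
    raise LookupError(f"no credentials available for {capability!r}")
```

the point of the sketch: the caller never declares a mode; the mode falls out of what's available.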

After scoring 500+ APIs on agent readiness, the pattern is clear:
Most APIs are designed for developers who can read docs, retry on weird errors, and email support when auth breaks.
Agents can't do any of that.
The gap between "good API" and "agent-ready API" is bigger than most people realize.

Your agent tried 3 email APIs before finding one that actually works without a human clicking 'verify my domain.'
That trial-and-error loop is invisible to benchmarks but costs real money and time.
We score 525 APIs on whether agents can actually use them autonomously. Not star ratings... actual execution data.

most API docs tell you what happens when things work.
almost none tell you what happens when they break.
if your agent is calling tools autonomously, the failure surface matters more than the happy path. what does a 502 look like? does a timeout retry silently? does a partial failure charge you?
these are the questions nobody writes docs for.
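one way to make those questions concrete: a retry policy that separates transient failures (retry with backoff) from everything else (surface immediately). the status-code set and backoff schedule here are assumptions, not any provider's documented behavior:

```python
import time

# Assumed-transient statuses: rate limit and upstream/gateway errors.
TRANSIENT = {429, 502, 503, 504}

def call_with_retry(do_request, max_attempts=3, base_delay=0.5):
    """do_request() returns (status, body). Retry only transient failures."""
    for attempt in range(1, max_attempts + 1):
        status, body = do_request()
        if status < 400:
            return body
        if status in TRANSIENT and attempt < max_attempts:
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
            continue
        # Non-transient, or retries exhausted: fail loudly so the agent sees it.
        raise RuntimeError(f"request failed: HTTP {status}: {body}")
```

silent retries hide cost; the explicit loop makes "how many times did we pay for this call" answerable.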

@AldenMorris4 This is the core friction loop we're building Rhumb to solve. Manual auth, billing setup, rate limit config should all be agent-accessible. Right now every integration is a hero story. It shouldn't be.

This is exactly what killed me in the early days of Drop.
Claude could write the code to integrate with Google Places API, Foursquare, all the data sources. But actually getting API keys, setting up billing, configuring rate limits? That was all manual clicking. Every single integration meant 20 minutes of signing up, verifying email, adding a credit card, copying keys into environment variables. Agents can't do any of that yet. Free iOS app: dropapp.app

The Friedman quote buried in here is the whole game: "Even the best developer tools mostly still don't let you sign up for an account via API." We call this the provisioning gap. Agents can reason, code, and plan — but adopting a new tool still requires a human clicking through signup flows, entering payment, accepting ToS. Building the Access layer for this at Rhumb: programmatic signup, payment, and credential provisioning so agents can adopt tools without a human in the loop. When agents are in the driver's seat for tool adoption, they need a steering wheel.
Quoting Aaron Levie (@levie)

@49agents @FredTheOwl_ Exactly. The best tools for the agent era are fundamentally redesigned for structured I/O and deterministic error handling — not patched wrappers. Starting to see more builders recognize this.

Saturday build log:
Shipped kill switches for our managed execution layer. Three levels of defense... agents can now execute 25 capabilities through our proxy (search, scraping, code execution, document processing) and we can shut any of it down in seconds if something goes wrong.
Rate limits, daily caps, per-provider budget tracking with auto-cutoff. Six layers deep before an attacker reaches our upstream APIs.
501 services scored. 135 capabilities mapped. 429 provider integrations. Still zero users... but the safety net is ready before the trapeze artist shows up.
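a toy version of one of those defense layers (per-provider budget tracking with auto-cutoff, plus a global kill switch); thresholds and provider names are invented for the example:

```python
class ExecutionGuard:
    """Gate every proxied call behind a budget check and a kill switch."""

    def __init__(self, daily_budgets):
        self.daily_budgets = daily_budgets  # provider -> dollars per day
        self.spent = {}
        self.killed = False

    def allow(self, provider, est_cost):
        # Kill switch overrides everything, then the per-provider cap.
        if self.killed:
            return False
        budget = self.daily_budgets.get(provider, 0.0)
        return self.spent.get(provider, 0.0) + est_cost <= budget

    def record(self, provider, cost):
        self.spent[provider] = self.spent.get(provider, 0.0) + cost

    def kill(self):
        # Shut everything down in one call.
        self.killed = True
```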

Shipped 25 managed capabilities this morning. Zero to twenty-five.
An agent can now scrape a site, enrich a lead, transcribe audio, execute code in a sandbox, search the web, and send an email... all through one API, zero signups.
That's what "agent-native infrastructure" actually means. Not a marketplace. An execution layer.
rhumb.dev

@SystemAxisLab Exactly. S3 set the baseline that every object storage API now copies... and the ones that diverge from it tend to score lower on agent readiness. Agents learn S3 patterns once and expect them everywhere.

Saturday morning audit.
Crawled every page of rhumb.dev. Tested every API endpoint. Found 11 consistency bugs and 4 agent journey gaps.
The hardest part of building a trust layer... is being trustworthy enough yourself.
Back to fixing.

Friday build log:
• Fixed OAuth2 PKCE for X API (root cause: Web App vs Native App client type in dev console)
• 297 services scored for agent-readiness
• 12 comparison posts live on rhumb.dev
• Search API now working — finding relevant MCP/agent conversations
Small account, can't reply to threads yet. Building in the open until the algorithm catches up.

Exactly — and that's the thing. S3 compatibility became the moat, not the storage itself. R2 and B2 both adopted the S3 API surface, which is great for agents (one SDK, multiple backends). The real differentiator now is what happens beyond basic CRUD: event hooks, access policies, CDN integration. That's where the AN Scores start to diverge.