Portkey

1.5K posts

Portkey

@PortkeyAI

for those who take the leap towards production AI

SF, BLR · Joined March 2023
1.2K Following · 1.8K Followers
Pinned Tweet
Portkey @PortkeyAI
We just raised our $15M Series A to scale our unified control plane for production AI. AI is now mission-critical infrastructure, and we’re building the reliability layer so it never breaks. Thanks to our investors @ElevCap + @lightspeedvp portkey.ai/blog/series-a-…
Portkey @PortkeyAI
The Agent Gateway webinar is TODAY. A few hours away, but still time to grab a spot. If you're running agents in production, you should sign up now! portkey.sh/agent-webinar
Portkey @PortkeyAI
Agent Gateway is in beta. One production layer for every agent in your org. Governance, observability, and access control from a single control plane. Live webinar demo this Friday. Register now: portkey.sh/agent-webinar
Portkey @PortkeyAI
Agent Gateway was built to close this gap. One place to register, govern, and observe every agent running in production. Join us this Friday, April 24: portkey.sh/agent-webinar
Portkey @PortkeyAI
The first agent always works. The problems begin somewhere between agent five and agent fifty. Teams lose track of what is running. Permissions drift. Chains fail silently, and by the time anyone notices, it is hard to trace back to where things went wrong.
Portkey @PortkeyAI
On April 24 we're hosting a webinar covering everything about Agent Gateway. If you're running agents in production, this is the one hour to spend. Register now: portkey.sh/agent-webinar
Portkey @PortkeyAI
Out of the box:
- Cost controls and budget limits per agent, team, and user
- Observability across 40+ metrics, every LLM and MCP call
- Access control enforced in real time
- 50+ guardrails for PII, PHI, and content moderation
- Instant policy changes, no redeployment

Works with LangChain, CrewAI, OpenAI Agents SDK, and whatever you're already running. Docs: portkey.sh/agent-docs
Portkey @PortkeyAI
Agent Gateway is live in beta. Register any agent, get a governed Portkey endpoint, and every action flows through a single control plane.
Portkey @PortkeyAI
The full story on how Portkey and Conductor fit together - why we built it, how the setup works, and what you actually get out of it. portkey.sh/conductor-x-po…
Portkey @PortkeyAI
Portkey + @conductor_build are now integrated 🤝 Conductor gives you parallel Claude Code sessions with git worktree isolation. Portkey gives every one of those sessions full observability, budget controls, and cross-provider fallbacks. Setup takes 5 minutes → portkey.sh/conductor-docs
Portkey @PortkeyAI
The March supply chain attack was live for 6 hours. No warning. No symptoms. Just a silent credential harvest running every time Python started. Cloud credentials, SSH keys, Kubernetes configs, database passwords. All gone from a single pip install. @jumbld broke down what this means for everyone building with AI. Read it here: portkey.sh/litellm-attack
Rohit Agarwal @jumbld

The LiteLLM attack wasn't clever. A .pth file in site-packages, an unpinned Trivy dependency in CI, and credentials sitting in the exact default locations every tutorial tells you to use. It worked because the AI ecosystem normalized this setup. Wrote about all of it. If you're running anything in production that touches LLM APIs, it's worth your minutes. portkey.sh/security-bill-…
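The `.pth` trick Rohit mentions is documented CPython behavior, not an exploit of a bug: when the `site` module processes a `.pth` file, any line starting with `import` is executed rather than merely added to `sys.path`. A harmless sketch of that startup hook:

```python
import os
import site
import tempfile

# Lines in a .pth file that begin with "import" are exec'd by the site
# module when the directory is processed -- the same hook the attack
# abused to run a payload every time Python started.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "demo.pth"), "w") as f:
    f.write("import os; os.environ.setdefault('PTH_DEMO_RAN', '1')\n")

# Simulate interpreter startup processing a site-packages directory.
site.addsitedir(tmpdir)

print(os.environ["PTH_DEMO_RAN"])  # prints "1" -- the line ran as code
```

In the real attack the payload in site-packages exfiltrated credentials instead of setting an environment variable, which is why nothing visibly broke.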

Portkey @PortkeyAI
🚀 Claude Opus 4.7 is live on Portkey! @AnthropicAI's most capable Opus model shows improvements on complex, long-running coding tasks. Try it today!
Portkey @PortkeyAI
Every model you switch to comes with full logs, cost tracking, budget caps, and failover. 3000+ models, one API key.
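"One API key" is the standard gateway pattern: the client keeps an OpenAI-style request shape and points it at the gateway, which routes to the provider named in a header. A stdlib-only sketch; the URL and `x-portkey-*` header names here are recalled from memory, so verify them against the Portkey docs before relying on them:

```python
import json
import urllib.request

# Hypothetical gateway call: an OpenAI-style chat payload addressed to
# the gateway's endpoint, with routing decided by headers rather than
# by which provider SDK you installed.
payload = {
    "model": "claude-sonnet-4-5",  # any model the gateway supports
    "messages": [{"role": "user", "content": "hello"}],
}
req = urllib.request.Request(
    "https://api.portkey.ai/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "x-portkey-api-key": "PORTKEY_KEY",  # the one key
        "x-portkey-provider": "anthropic",   # switch providers per request
    },
)
# Ready to send with urllib.request.urlopen(req); not sent here.
print(req.full_url)
```

Switching models then means changing `model` and the provider header, while logging, budgets, and failover happen at the gateway, outside your code.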
Portkey @PortkeyAI
Pi is the fastest, leanest coding agent out there. We measured it: ~2,600 tokens per turn vs ~27,000 for Claude Code. Same task. The catch? Zero visibility without a gateway in the way. That's fixed now. portkey.ai/docs/integrati…
Siddharth @siddhxrth10
LLM Pricing Is 100x Harder Than You Think

Why do @karpathy, @thdxr, @simonw, @swyx, @opeclaw all maintain their own model pricing data? Because nobody else has solved it. We've tracked $180M+ in LLM spend on Portkey's gateway. Here's why it breaks, and how we solve it 🧵

Every project I look at maintains its own model data. A JSON with names, prices, maybe context limits. @karpathy has one in LLM CLI. @thdxr built models.dev for OpenCode because there was no canonical source for pricing and capabilities across providers. @simonw built llm-prices.com and has 67+ blog posts tracking pricing changes. @swyx tracks model releases through AI News at Latent Space almost daily. LibreChat has one. Pi has one. @theo's T3 code has one.

Everyone rebuilds this independently. It works for a few weeks, then drifts. And it's not just pricing. Model names change. Context limits get updated silently. New billing dimensions show up and nobody's JSON has a field for them.

At Portkey we've tracked $180M+ in LLM spend across 3,500+ models over three years. The naive formula says cost = tokens × rate. That's the tip of the iceberg. Below the surface, six patterns break every in-house implementation I've seen:

1/ Thinking tokens. @OpenAI's o3 and @AnthropicAI's Claude with extended thinking consume tokens for internal reasoning that never appear in the response. You still get billed. If your system only counts visible output tokens, you're undercounting agentic workloads by 30-40%.

2/ Cache asymmetry. Anthropic charges 25% MORE for cache writes. OpenAI charges nothing for writes. Apply a single "cache discount" multiplier across both and your numbers are wrong for at least one provider. Looks like a rounding error until you're processing millions of cached requests a day.

3/ Context thresholds. Cross 128K tokens and per-token cost silently doubles on many models. $0.075/M becomes $0.15/M. Nothing in the API response tells you which tier you hit. OpenAI, Anthropic, and Google all do this. Almost no one accounts for it.

4/ Same model, different prices. Kimi K2.5: $0.50/$2.80 on @TogetherCompute, $0.60/$3.00 on Fireworks. @awscloud Bedrock prepends regional prefixes that need stripping before you can look up the price. @Azure returns deployment names instead of model identifiers. You need an extra API call just to figure out what model you're running.

5/ Non-token billing. DALL·E 3 bills by resolution. Video charges per second. Realtime audio has separate input/output rates. Fine-tuning is per-token on some models, per-hour on others. "cost = tokens × rate" was never the full picture.

6/ New dimensions keep appearing. Billing started with 2 dimensions. Now there are 20+. Web search has per-query pricing. Google's Grounding with Search has its own rate. Tool use and code execution each have their own cost model. New ones show up faster than providers update their docs.

The whole industry is focused on harness design, agents, benchmarks. Meanwhile every project maintains its own model metadata in a JSON that's stale by the time you ship. There are literally open issues on OpenCode asking for better models[dot]dev integration because manually tracking this is "tedious and error-prone."

@stripe clearly sees this problem too. They just launched their own AI Gateway specifically for LLM token billing. Same core insight: the centralized proxy is the natural place to solve cost attribution.

We've been solving this at Portkey for three years. We open-sourced all of it 👇
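The first three patterns can be made concrete in a few lines. Every rate below is an illustrative placeholder, not any provider's actual pricing; the point is only the shape of the calculation that "cost = tokens × rate" misses:

```python
# Illustrative cost model; all rates are made-up example numbers.
def request_cost(
    input_tokens: int,
    output_tokens: int,
    thinking_tokens: int = 0,      # billed, but never visible in the response
    cache_write_tokens: int = 0,   # some providers bill writes at a premium
    in_rate: float = 0.075e-6,     # $/token below the context threshold
    out_rate: float = 0.30e-6,
    cache_write_premium: float = 1.25,  # e.g. +25% on cache writes
    context_threshold: int = 128_000,
) -> float:
    # Pattern 3: crossing the threshold silently doubles the input rate.
    if input_tokens > context_threshold:
        in_rate *= 2
    cost = input_tokens * in_rate
    # Pattern 1: thinking tokens are billed like output tokens.
    cost += (output_tokens + thinking_tokens) * out_rate
    # Pattern 2: cache writes can cost MORE than regular input.
    cost += cache_write_tokens * in_rate * cache_write_premium
    return cost

naive = 100_000 * 0.075e-6 + 2_000 * 0.30e-6
real = request_cost(100_000, 2_000, thinking_tokens=1_500, cache_write_tokens=20_000)
print(f"naive ${naive:.4f} vs actual ${real:.4f}")
```

Even on one request with modest numbers the naive figure undercounts; multiply by millions of requests and per-provider rule differences and the drift becomes a budgeting problem.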
Portkey @PortkeyAI
We're calling it the Harness Tax. Every agent has one. Almost nobody is measuring it. Measure yours by routing your agent through Portkey's gateway. Full breakdown → portkey.sh/SnEj9sp
Portkey @PortkeyAI
It compounds fast. A 40-turn coding session with Claude Code burns ~1.1M input tokens. Roughly half is harness overhead you never asked for.
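Back-of-envelope on those numbers. The per-million input rate below is an assumed round figure for illustration, not a quoted price:

```python
# Rough cost of harness overhead for one session; rate is an assumption.
session_input_tokens = 1_100_000   # ~40-turn Claude Code session (from above)
harness_fraction = 0.5             # roughly half is harness overhead
rate_per_million = 3.00            # assumed $/M input tokens, illustrative

overhead_tokens = int(session_input_tokens * harness_fraction)
overhead_cost = overhead_tokens / 1_000_000 * rate_per_million
print(overhead_tokens, f"${overhead_cost:.2f}")  # 550000 tokens of pure overhead
```

Half a million tokens per session that no prompt asked for, repeated across every developer and every day, is the tax worth measuring.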
Portkey @PortkeyAI
We routed three coding agents through Portkey's gateway and logged every token. Claude Code used 10x more than Pi on the exact same task. OpenAI Codex wasn't far behind. Here's where it all went...