Arthur

1.1K posts

@itsArthurAI

The AI Performance Company. Arthur helps teams discover, govern, and innovate AI systems that perform and scale reliably.

New York, USA · Joined January 2019
591 Following · 2.2K Followers
Pinned Tweet
Arthur @itsArthurAI
☁️ Arthur is now available in @googlecloud! Many of our customers are building on Google Cloud and leveraging the latest Gemini and agent frameworks, so we partnered with Google to make Arthur available directly within your GCP environment. This means data never leaves your GCP environment, procurement is seamless through the Marketplace, and deployment fits naturally into your existing workflows and stack.
With the explosion of agents, teams lose visibility into which agents are running and lack insight into failures. As enterprises race to adopt Agentic AI, a comprehensive agentic governance approach is crucial to preventing chaos, security nightmares, and business continuity issues. That’s why we launched Arthur’s Agent Discovery & Governance (ADG) Platform on Google Cloud.
With Arthur on Google Cloud, you can:
🔍 Automate Discovery: Instantly find and catalog agents company-wide
📈 Unify Monitoring: Monitor and govern internally-developed and third-party agentic solutions
🛡️ Centralize Policy Management: Enforce acceptable use and security policies for all agent interactions
🔄 Continuously Evaluate: Monitor performance aligned specifically to agent tasks
Read full announcement → arthur.ai/blog/arthur-la…
0
2
2
377
Arthur @itsArthurAI
Our FDE team sees this repeatedly: teams that experiment systematically don't just fix bugs, they ship faster and build more trust 🤝 Read Part 4 here 👇 arthur.ai/blog/best-prac…
0
0
0
56
Arthur @itsArthurAI
In Part 4 of our Best Practices for Building Agents series, we cover:
→ Testing changes in isolation and measuring real impact
→ Using supervised evals to prevent regressions before deployment
→ How teams use experiments to build trust with customers
(3/4)
1
0
0
279
Arthur @itsArthurAI
Continuous evals tell you your agent is failing, but they don't tell you how to fix it. That's where experiments come in 🧵 (1/4)
2
0
12
147.3K
Arthur @itsArthurAI
🚀 We had a full house last week at the Future of DevEx NYC with insightful convos between AI builders, engineers, and founders. ⚡ From AI-powered workflows to observability in production, the energy in the room was unmatched. Massive thank you to our co-hosts @Deskree, @PostHog, @Grafana, and everyone who showed up and made this one unforgettable. NYC, we'll be back soon 👀
0
0
1
60
Arthur @itsArthurAI
Hardcoded prompts get you to a demo. Managed prompts get you to production. There's a big gap between the two.
⚡ Hardcoded prompts create real problems at scale:
→ No version history: someone tweaks a prompt and nobody tracks it
→ No rollbacks: bad changes mean full redeployments
→ No environment separation: staging drifts from production
→ No model flexibility: swapping providers means rewriting code
Arthur's Prompt Management system fixes this by externalizing prompts from your codebase with versioning, environment tagging, and instant one-click rollbacks.
If you're tired of treating prompts like throwaway strings in your repo, learn how to set up your agent with Arthur’s Prompt Management system: arthur.ai/blog/prompt-ma…
0
0
1
48
Arthur @itsArthurAI
Shipping prompt updates without testing them is like pushing code without running your test suite. You might get lucky. Or you might break something that takes weeks to notice.
⚡ Arthur's Prompt Experiments feature brings engineering rigor to prompt iteration, letting you create:
→ Golden datasets: curated test cases with expected outputs
→ Custom evaluators: automated checks for specific behaviors
→ Version comparison: run the same tests across prompt versions
→ Regression detection: know exactly what broke and in which change
Our CTO, Zach Fry, did a walkthrough showing how Arthur lets you build prompt experiments to ship updates with confidence. If you're tired of vibe-testing your prompts, this is how you move to quantifiable evals. #aiagents #aievals #PromptEngineering
0
0
0
86
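A minimal sketch of the experiment loop described in the tweet above (golden dataset, custom evaluator, version comparison, regression detection). Everything here is hypothetical and for illustration only: `GOLDEN`, `agent_v1`, `agent_v2`, `exact_match`, and `run_experiment` are made-up names, the two "agents" are plain functions standing in for an agent running two prompt versions, and none of it is Arthur's API.

```python
from typing import Callable

# Golden dataset: curated inputs with expected outputs (illustrative values).
GOLDEN = [
    {"input": "refund order 123", "expected": "refund"},
    {"input": "where is my package", "expected": "tracking"},
    {"input": "cancel my plan", "expected": "cancel"},
]


def agent_v1(text: str) -> str:
    """Stand-in for the agent running prompt version 1 (assumption)."""
    if "refund" in text:
        return "refund"
    if "cancel" in text:
        return "cancel"
    return "tracking"


def agent_v2(text: str) -> str:
    """Stand-in for prompt version 2, which mishandles cancellations."""
    if "refund" in text:
        return "refund"
    return "tracking"


def exact_match(output: str, expected: str) -> bool:
    """Custom evaluator: binary pass/fail against the expected label."""
    return output == expected


def run_experiment(agent: Callable[[str], str]) -> dict[str, bool]:
    """Run every golden case through one version; map input -> pass/fail."""
    return {c["input"]: exact_match(agent(c["input"]), c["expected"]) for c in GOLDEN}


baseline = run_experiment(agent_v1)
candidate = run_experiment(agent_v2)

# Regression detection: cases that passed on the baseline but fail on the candidate.
regressions = [case for case, ok in baseline.items() if ok and not candidate[case]]
print(regressions)  # → ['cancel my plan']
```

Running the same golden cases against both versions is what turns "the new prompt feels better" into a list of specific cases that changed.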
Arthur @itsArthurAI
As part of our series on building reliable AI agents in production, Part 3 is live: Continuous Evaluations.
Most teams don't know their agent is misbehaving until a user files a complaint. By then, trust is already damaged. Test suites help, but they're not enough because agents are non-deterministic. Production traffic is messier and more diverse than any handwritten test set. You need automated checks running against real interactions.
Our FDEs see this pattern across accounts: teams that adopt continuous evals ship faster and build more user trust. Teams that don't are stuck playing whack-a-mole with bug reports.
In Part 3, we cover:
→ Supervised vs unsupervised evals (and why only one works in production)
→ Why binary pass/fail beats scoring on a range
→ How to write evals specific enough for an LLM to judge reliably
→ Calibrating edge cases with examples
→ Fitting evals into your team's workflow
arthur.ai/blog/best-prac…
0
0
1
42
Arthur @itsArthurAI
Most teams hardcode prompts directly into their agent's codebase. That works in demos. It breaks at scale.
Our CTO, Zach Fry, did a walkthrough showing how Arthur handles prompt management in production, using the same tool our Forward Deployed Engineering team deploys with enterprise customers every day. That means:
→ External prompt storage: iterate on prompts without redeployments
→ Versioning & rollback: tag by environment, promote to prod, revert instantly
→ Conditional templating: build dynamic, context-aware prompts at runtime
→ Regression testing: validate changes against real production interactions before they ship
If you're building agents that need to make it past the demo stage, prompt management is non-negotiable.
0
1
0
87
Arthur @itsArthurAI
Prompts define how your agent behaves. Hardcoding them into your codebase means every tweak requires a full redeploy, changes go untracked, and testing is nearly impossible without running the whole stack.
A prompt management tool helps you version, tag, test, and roll back prompts, which means fewer regressions. Here are three non-negotiables for any prompt management tool:
☁️ External Storage: Keep prompts outside your codebase to update prompt logic without redeploying.
⚒️ Version Control: Maintain a full history of changes so you can roll back instantly.
📄 Templating: Build dynamic, reusable prompt structures instead of static text blocks.
Check out our new deep dive on prompt management and agent building best practices: arthur.ai/blog/best-prac… #LLMOps #AIEngineering #DevOps #GenAI
0
1
0
62
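The three requirements named in the tweet above (external storage, version control, templating) can be sketched as a toy in-memory store. This is an illustration under stated assumptions, not Arthur's Prompt Management API: `PromptStore`, `save`, `tag`, and `render` are hypothetical names, and a real tool would persist prompts outside the process instead of in a Python dict.

```python
from dataclasses import dataclass, field
from string import Template


@dataclass
class PromptStore:
    """Hypothetical in-memory prompt store (assumption, not a real product API)."""

    versions: dict = field(default_factory=dict)  # prompt name -> list of templates
    tags: dict = field(default_factory=dict)      # (prompt name, env) -> version number

    def save(self, prompt: str, template: str) -> int:
        """Append a new version to the history and return its 1-based number."""
        history = self.versions.setdefault(prompt, [])
        history.append(template)
        return len(history)

    def tag(self, prompt: str, env: str, version: int) -> None:
        """Point an environment tag (e.g. 'prod') at an existing version."""
        if not 1 <= version <= len(self.versions.get(prompt, [])):
            raise ValueError(f"unknown version {version} for prompt {prompt!r}")
        self.tags[(prompt, env)] = version

    def render(self, prompt: str, env: str, **variables: str) -> str:
        """Fetch the version tagged for `env` and fill in its template variables."""
        version = self.tags[(prompt, env)]
        return Template(self.versions[prompt][version - 1]).substitute(**variables)


store = PromptStore()
v1 = store.save("support", "You are a support agent for $product.")
v2 = store.save("support", "You are a concise support agent for $product.")

store.tag("support", "prod", v2)  # promote version 2 to production
store.tag("support", "prod", v1)  # rollback: re-point the tag, no redeploy

print(store.render("support", "prod", product="Acme"))
# → You are a support agent for Acme.
```

Because application code only asks for "the prompt tagged `prod`", a rollback is a change to a tag, not a change to the codebase.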
Arthur @itsArthurAI
The number one indicator of getting your agent to production? Observability. We distilled lessons from our FDE team's real-world agent deployments that actually make it to production. The teams that shipped with confidence all had one thing in common: visibility into what their agents were actually doing. Check out Part 1 of our Best Practices for Building Agents, Observability and Tracing: arthur.ai/blog/best-prac… #AgentEngineering #OpenTelemetry #AgentDevelopmentLifecycle
0
0
0
100
Arthur @itsArthurAI
If you can’t discover agents in your cloud environment, you can’t govern them.
We are seeing a massive rise in "Shadow AI." Teams are spinning up agents across many environments and tech stacks, and this is creating a governance nightmare.
On Wednesday, Arthur CEO Adam Wenchel is tackling the hardest part of the agentic future: Discovery and Governance.
You’ll learn how to:
1️⃣ Eliminate blind spots by cataloging every agent automatically.
2️⃣ Protect PII and IP without slowing down development.
3️⃣ Solve policy fragmentation across multi-cloud environments.
RSVP now [link in comments] 👇
PS - Can’t make it? All RSVPs will get the recording afterward, so sign up anyway. #aigovernance #ai #aiagents
2
0
0
62
Arthur @itsArthurAI
2025 was the "Year of the Agent." Companies planted 1,000 flowers to see what would bloom. 🌸💥 The result? Innovation, yes. But also massive sprawl, fragmented policies, and little real oversight.
The question for 2026 isn't "how do we build agents?" - it's "how do we govern agents?"
Next Wednesday, our CEO, Adam Wenchel, is hosting a deep dive on building successful Agent Discovery & Governance (ADG) strategies. You’ll learn how to:
🔎 Uncover Agent Sprawl: How to automatically discover and continuously catalog agents across your enterprise.
🛡️ Standardize Governance: How to ensure consistent security, compliance, and acceptable-use controls across every agent.
🚀 Unify Operational Control: How to gain a "single pane of glass" view of agent behavior, performance, and risk.
Join us to turn governance into your competitive advantage. RSVP link in the comments! 👇
1
0
0
51
Arthur @itsArthurAI
⚡ NOW AVAILABLE: Agent Discovery & Governance (ADG) Platform on @googlecloud.
Arthur is now officially available on the Google Cloud Marketplace, bringing production-grade governance to one of the most trusted cloud environments. By running natively within Google Cloud, Arthur allows enterprises to automate discovery of agents, govern internally-developed and third-party agentic solutions, enforce acceptable use policies, and continuously evaluate and improve agents.
See why Google Cloud and Arthur are a key part of the agentic AI stack (link in comments)
1
0
0
55