chriscooning retweetledi
chriscooning
43 posts

chriscooning
@chriscooning
AI-Native PMM @arizeai Marketing is a product. Build accordingly.
California, USA Katılım Temmuz 2018
125 Takip Edilen27 Takipçiler
chriscooning retweetledi
chriscooning retweetledi

arize and shine, new york city!
Arize x Times Square. Agents don't work without evals. Trace it, Evaluate it, Fix it.
Tag @arizeai if you catch it in the wild!
English
chriscooning retweetledi
chriscooning retweetledi
chriscooning retweetledi
chriscooning retweetledi
chriscooning retweetledi
chriscooning retweetledi
chriscooning retweetledi
chriscooning retweetledi

Seminal article on harness abstraction from Anthropic.
Old:
One container, if it dies, session is lost
Credentials exposed to untrusted code
New:
Harness is stateless, sandboxes are disposable
Inference starts instantly
Credentials never reachable
anthropic.com/engineering/ma…
English
chriscooning retweetledi
chriscooning retweetledi

@karpathy We need a gaming-style interface for agentic IDE, something like StarCraft II, mini map is helpful. Later, agents moving data between systems will feel more like Factorio.
For now, I’m just using more monitors.
x.com/Yuchenj_UW/sta…
Yuchen Jin@Yuchenj_UW
Before the ideal agentic engineering IDE arrives, here’s my setup:
English
chriscooning retweetledi

Excited to share arize-skills today!
You guys loved the phoenix ones, and we rolled them out for Arize AX as well now. These skills have completely changed how you interface with Arize.
You can instrument your agent, go debug a trace, run experiments and evaluate them - all with these skills. It's 2026 - we're building for agents, and these are an important part of that experience.
Get started today!
npx skills add Arize-ai/arize-skills --skill "*" --yes
arize.com/blog/arize-ski…
English
chriscooning retweetledi

Learn how to build a production agent from our own real experience!
Structured planning is what turns an agent from a tool executor into a workflow orchestrator.
Here's what worked for us:
- Planning as structured tool calls, not prompt instructions
- Plan pinned after the system prompt on every loop iteration
- 4 task statuses: pending, in_progress, completed, blocked
- A hard gate that prevents finishing with incomplete tasks
Part 1 of our "How We Built Alyx" deep dive series: arize.com/blog/how-to-bu…
English
chriscooning retweetledi

Alyx can surface insights about your traces in the Arize UI. I wanted to do the same thing from my terminal.
Pulled Alyx's own spans with the AX CLI, dropped the file into Cursor, and asked it what the most common user questions are.
Same analysis. No browser.
We just released a developer preview: arize.com/blog/ax-cli-de…
pip install arize-ax-cli
English
chriscooning retweetledi
chriscooning retweetledi

Alyx 2.0 is live.
An AI engineering agent built into Arize AX that can reason across multi-step workflows and execute autonomously.
↳ Error analysis
↳ Prompt experimentation
↳ Trace debugging
No more stitching everything together by hand.
Most AI assistants in dev tools follow one decision tree from the top. Alyx breaks down complex tasks, maintains context, and acts across your entire AI lifecycle.
Learn more on the blog: arize.com/blog/alyx-2-0-…
English
chriscooning retweetledi

Last December @karpathy went from 80% manual coding to 80% agent-assisted in a single month. His takeaway: the intelligence is ahead of the infrastructure.
I've been thinking about this a lot. We're living through two shifts at the same time, and I don't think we're talking about them together enough.
1️⃣ The first is who writes the code. Coding agents like @claudeai Code are now responsible for somewhere between 16-23% of @github contributions, and that number is climbing fast.
2️⃣ The second is what the code does. Modern applications aren't purely code anymore. Behavior emerges from the interaction of code, model weights, and prompts — and it varies at runtime in ways we've never had to account for.
My belief is that we need to connect Coding Agents to our traces and evals for them to become truly useful.
If you want to go deeper, I'll be demoing and having a lively discussion about this with @HamelHusain in a free webinar where we'll get into the nuts and bolts — automating evals with Claude Code and @ArizePhoenix.
Hamel Husain@HamelHusain
If you find AI evals painful, this is for you. We'll show you how to use claude code to automate things **without stepping on footguns** (there are many) with @mikeldking, creator of Phoenix, one of my fav OSS tools maven.com/p/2c8410/autom… Recording sent to all signups.
English

