Sabitlenmiş Tweet
Braintrust
658 posts

Braintrust
@braintrust
The observability layer for production AI.
Katılım Ağustos 2023
54 Takip Edilen6.6K Takipçiler

An eval platform is more than just a test runner. Evals require shared definitions of "good," reliable data pipelines, labelling workflows, versioning, and trust in results across many teams and model changes.
Hear about the design principles behind Braintrust's platform in this session from @aidotengineer.
English

Evals course module ten: building a multi-turn chat app.
Move from single-turn to multi-turn use-cases by building a chatbot CLI app with production logging. Use init_logger, wrap_openai, and @ traced to capture every conversation as a single trace.
More here → braintrustdata.link/evals-course-y…
English

Evals course module nine: how to analyze your eval results.
Learn about the four ways to analyze eval data: experiment comparison, Loop queries, the Braintrust MCP server, and manual filtering in the UI.
More here → x.com/braintrust/sta…
Braintrust@braintrust
English

Evals 101: a new course from Braintrust. Everything you need to know about evals, and how to do them yourself.
Module one: Why are evals important?
- the six most common problems developers face when shipping AI applications
- why traditional software thinking doesn't apply to AI
- how evals can fix these problems
English

For AI PMs, evals are the new PRD.
At @PLEDalliance Summit New York, Ameya Bhatawdekar discussed the new product development loop and how to translate every element of a traditional PRD into its eval equivalent.

English

If you're building AI products but aren't writing evals, this is the place to start. In Evals for engineers, solutions engineer Doug Guthrie will show you how to:
- Instrument an agent with the Braintrust SDK
- Look at traces across model calls, tool use, and outputs
- Build datasets from failure modes and write scoring functions
- Iterate on your prompt and measure quality over time
English
