
Adam Łucek
@AdamRLucek
i like making things | Applied AI @LangChain













Trace data is worth its weight in gold these days, if you know what to do with it! As has been established, creating effective agents requires shipping early, observing behavior, and iterating quickly. At the core of this are your agent traces, which capture exact inputs, outputs, steps, and metadata along the way. Analyzing traces helps surface inefficiencies and areas for improvement, but traces can also be used in more sophisticated ways to set up robust evaluations. Here are two of the ways we use traces to build evals for production agents 👇










i feel like there's a general misunderstanding about open source models. most people use a frontier model, switch the api request to an open source model, see poor performance, and then churn off. this will never work. you have to spend the time to handhold these models through the tasks you're trying to accomplish. basically every coding agent that you use is tuned to output prompts in the format that these frontier models expect and perform best on. if you invest the time to add custom prompting for these OSS models, you'll see improved performance, but it'll never work out of the box.



