Judgment Labs

5 posts

@JudgmentLabs

Monitor Your Agent's Behavior. Judgment helps track and judge the quality of agents in offline and online setups. Set up Sentry-style alerts and RL easily!

Joined April 2025
0 Following · 168 Followers
Judgment Labs reposted
Alex Shan (@alexshander03)
@JudgmentLabs is hosting a pop-up poker night at @NeurIPSConf for RL folks tomorrow night (Tuesday, December 2)! ♠️ Spots are limited and we're prioritizing attendees who are passionate about/actively working on AI research in the RL, evals, and rubrics space. If you're looking for sharp and deep convos with elite RL talent, we'd love to see you! 🧡 Event link in replies
1
3
7
1.3K
shai 🌻 (@shaiunterslak)
Half the good AI startups I know have tried one of the 100 evals platforms and then built their own in-house.
33
5
272
38.5K
Alex Shan (@alexshander03)
At @JudgmentLabs we've had the opportunity to work with countless AI agent teams building fantastic products. Measuring and understanding agent behavior has become a bottleneck to agent improvement and everyone knows it. However, few get this process right and most teams fall into common pitfalls. In this piece from our research team, we cover some failure modes of teams attempting to monitor their agents and what a proper future of agent quality measurement will look like. Link in replies
2
5
16
3.9K
hallerite (@hallerite)
who's building the wandb for LLM-RL trajectories
9
0
50
12.2K