devlord





training data is starting to look like a zero-knowledge proof problem. labs have to judge quality without seeing the full dataset or the QC pipeline behind it. vendors proxy quality with multi-rollout pass rates, small-model ablations, and downstream eval gains, but compute and iteration costs explode as environments and trajectories grow more complex. quality has no ceiling, and the best data is often the hardest to capture in a metric or explain in a writeup. huge alpha in making data quality more legible.



We’re launching @JudgmentLabs today and announcing $32M in funding. As AI agents take on more of the work that creates economic value, they generate massive amounts of production data: the clearest record of how they behave with users, software, and the real world. Judgment builds infrastructure for improving AI agents from production data.







It is estimated that you can get a single copy of every Pokémon card ever created in the English TCG for around $350,000 (this varies greatly depending on condition). It would be an incredible feat to crowdfund this through some type of DAO and hold the entire collection for 25-50 years 🤯, finding ways to generate revenue to keep adding newly released cards over time. I feel like @KabutoKing_ could lead this 😏👀.





