

Tinker
71 posts

@tinkerapi
I tink, therefore I am. Post-training API by @thinkymachines





Hello MJ1: The World's TASTIEST Judge Model

Agent verification is the bottleneck to AI's progress. The field's ability to verify visual output lags far behind that of text, especially in matters of ~taste~. So we built the world's tastiest multimodal judge model, MJ1.

@aviral_kumar2 Great work, Aviral! I also investigated this space with Thinking Machines’ Tinker, using RLVR to get reasoning models to adaptively budget their compute by problem difficulty (pranavviswanath.github.io/rlvr-reasoning/), and found similar results!









Contextual AI used Tinker to post-train the planning behavior for a search agent. They land on a two-stage training recipe: On-Policy Distillation and GRPO with a CLP reward. Read more 👇


