TrainLoop

@TrainLoop_ai

Post training research

San Francisco, CA · Joined January 2025

25 Following · 465 Followers

TrainLoop retweeted

Jackson Stokes@jackson_stokes·5d

We partnered with @mercor_ai to test a simple idea: What if knowledge-work agents were just… coding agents? Result: +25% performance, 2x faster, cheaper, and new SOTA on APEX-Agents. @josancamon19

100

16K

TrainLoop@TrainLoop_ai·Apr 29

+48% from a single RL step, and 100k rollouts from a single policy. @LoganGrasby wrote up his findings that off-policy training with OAPL is more robust than we thought!

Jackson Stokes@jackson_stokes

Can we train a model in a single RL step? During recent experiments, @Logangrasby found that a single step of OAPL increased model performance from ~0 to 48% on a clinical reasoning and prediction task. Turns out, data staleness might matter less than we think. With @pathos :

440

TrainLoop@TrainLoop_ai·Apr 17

Task-specific training can be far more efficient than we realize. This work by @hasith_v explores the LoRA-trained solution space for GSM8k, finding a massive "plane" of solutions, representable in a single rank.

Jackson Stokes@jackson_stokes

We trained LoRA adapters of different ranks to understand training dynamics, finding that adapters for GSM8k live in a surprisingly vast, low-rank solution space. This hints that some model skills are easy to learn, and training is more forgiving than we think. @hasith_v 1/6 🧵

453

TrainLoop@TrainLoop_ai·Apr 9

Specialized models for the most important tasks!

Jackson Stokes@jackson_stokes

We post-trained MedGemma to be SoTA in visual medicine ddx, outperforming Opus 4.6, Gemini 3.1 and GPT-5.4 while running at ~1/30th the cost. @getnolla Part 1 - improving visual reasoning 🧵1/6

294

TrainLoop retweeted

Joan Cabezas@josancamon19·Dec 16

🧵 Labs and VCs are throwing cash at RL environments, especially for computer and browser use. Yet, with just 4 customers and 30+ vendors, is cloning every website in the world really the path to scale? Of course not. Introducing TRACE: Trajectory Recording and Capture of Environments.

9.7K

TrainLoop@TrainLoop_ai·Dec 5

@FlyaKiet @jackson_stokes @mlpierce22 @avimakesrobots @superset_sh TrainLoop ❤️🤝 Superset

259

Kiet@FlyaKiet·Dec 5

@TrainLoop_ai is one of those companies in our group office hours that has something special. They're hitting an exponential curve and are now hiring more cracked engineers! Had a great time onboarding @jackson_stokes and @mlpierce22 to Superset yesterday with @avimakesrobots