TrainLoop

10 posts

@TrainLoop_ai

Post training research

San Francisco, CA · Joined January 2025
25 Following · 465 Followers
TrainLoop reposted
Jackson Stokes (@jackson_stokes)
We partnered with @mercor_ai to test a simple idea: What if knowledge-work agents were just… coding agents? Result: +25% performance, 2x faster, cheaper, and new SOTA on APEX-Agents. @josancamon19
6 replies · 9 reposts · 100 likes · 16K views
TrainLoop (@TrainLoop_ai)
+48% from a single RL step, and 100k rollouts from a single policy. @LoganGrasby wrote up his findings that off-policy training with OAPL is more robust than we thought!
Jackson Stokes (@jackson_stokes)

Can we train a model in a single RL step? During recent experiments, @Logangrasby found that a single step of OAPL increased model performance from ~0 to 48% on a clinical reasoning and prediction task. Turns out, data staleness might matter less than we think. With @pathos:

0 replies · 0 reposts · 4 likes · 440 views
TrainLoop (@TrainLoop_ai)
Task-specific training can be far more efficient than we realize. This work by @hasith_v explores the LoRA-trained solution space for GSM8k, finding a massive "plane" of solutions, representable in a single rank.
Jackson Stokes (@jackson_stokes)

We trained LoRA adapters of different ranks to understand training dynamics, finding that adapters for GSM8k live in a surprisingly vast, low-rank solution space. This hints that some model skills are easy to learn, and training is more forgiving than we think. @hasith_v 1/6 🧵

0 replies · 0 reposts · 5 likes · 453 views
TrainLoop reposted
Joan Cabezas (@josancamon19)
🧵 Labs and VCs are throwing cash at RL environments, especially for computer and browser use. Yet, with just 4 customers and more than 30 vendors, is cloning every website in the world really the path to scale? Of course not. Introducing TRACE: Trajectory Recording and Capture of Environments.
6 replies · 9 reposts · 78 likes · 9.7K views
Kiet (@FlyaKiet)
@TrainLoop_ai is one of those companies in our group office hours that has something special. They're hitting an exponential curve and are now hiring more cracked engineers! Had a great time onboarding @jackson_stokes and @mlpierce22 to Superset yesterday with @avimakesrobots
2 replies · 1 repost · 10 likes · 3.9K views