SeeRole

2 posts

@SeeRoleTask

Joined March 2024

21 Following · 1 Follower

SeeRole @SeeRoleTask · 28 Oct

■ To review #ThinkingMachines

Thinking Machines @thinkymachines

Our latest post explores on-policy distillation, a training approach that unites the error-correcting relevance of RL with the reward density of SFT. When training it for math reasoning and as an internal chat assistant, we find that on-policy distillation can outperform other approaches for a fraction of the cost. thinkingmachines.ai/blog/on-policy…

0

0

55

SeeRole retweeted

Elon Musk @elonmusk · 11 Dec

Very important concept!

The Rabbit Hole @TheRabbitHole

Seems like a good time to remind people:

17.7K

194.8K

54M
