Karen Duane

1 post

@Keyu_Duan

Joined March 2024
2 Following · 6 Followers
Karen Duane retweeted
Michael Qizhe Shieh @michaelqshieh
Greedy Coordinate Gradient is a useful method, but it takes a long time to run. We accelerated it by 5.6x using a method called probe sampling. The key idea behind probe sampling is to use a smaller draft model to filter out unpromising candidates in the search. The difficulty is that a small draft model often disagrees with the target model, so we have found it very effective to dynamically measure the agreement between the smaller draft model and the bigger target model, hence the name “probe sampling”. Here is the paper: arxiv.org/pdf/2403.01251….
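The filtering idea in the post can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: the function names, the `reduction` parameter, the use of Spearman rank correlation as the agreement measure, and the rule that maps agreement to the number of candidates kept are all assumptions made for the example. The two "models" are plain Python callables that return a loss for a candidate.

```python
import random


def spearman(xs, ys):
    """Spearman rank correlation for two equal-length sequences without ties."""
    n = len(xs)

    def ranks(values):
        order = sorted(range(n), key=values.__getitem__)
        return {idx: r for r, idx in enumerate(order)}

    rx, ry = ranks(xs), ranks(ys)
    d2 = sum((rx[i] - ry[i]) ** 2 for i in range(n))
    return 1 - 6 * d2 / (n * (n * n - 1))


def probe_sampling_step(candidates, draft_loss, target_loss,
                        probe_size, reduction=1.0, rng=random):
    """One candidate-selection step (hypothetical sketch, probe_size >= 2).

    draft_loss is assumed cheap, target_loss expensive.
    Returns (best_candidate, its_target_loss, agreement in [0, 1]).
    """
    # 1. Score every candidate with the cheap draft model.
    draft = [draft_loss(c) for c in candidates]

    # 2. Score a small random probe set with the expensive target model.
    probe_idx = rng.sample(range(len(candidates)), probe_size)
    scored = {i: target_loss(candidates[i]) for i in probe_idx}

    # 3. Measure how well the two models rank the probe set alike.
    rho = spearman([draft[i] for i in probe_idx],
                   [scored[i] for i in probe_idx])
    agreement = (1 + rho) / 2

    # 4. High agreement: trust the draft model and keep few candidates.
    #    Low agreement: keep many and fall back on the target model.
    keep = max(1, int((1 - agreement) * len(candidates) / reduction))
    filtered = sorted(range(len(candidates)), key=draft.__getitem__)[:keep]

    # 5. Run the target model only on the kept candidates, then pick the
    #    best among everything the target model has scored.
    for i in filtered:
        if i not in scored:
            scored[i] = target_loss(candidates[i])
    best = min(scored, key=scored.get)
    return candidates[best], scored[best], agreement
```

When the draft model ranks the probe set exactly like the target model, the agreement is 1 and only the single best draft candidate is passed to the target model. When the rankings are reversed, the agreement is 0 and every candidate is evaluated, which falls back to the unaccelerated search.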