Karen Duane

@Keyu_Duan

Joined March 2024
2 Following · 6 Followers
Karen Duane retweeted
Michael Qizhe Shieh (@michaelqshieh)
Greedy Coordinate Gradient is a useful method, but it takes a long time to run. We accelerated it by 5.6x using a method called probe sampling. The key idea is to use a smaller draft model to filter out unpromising candidates in the search. The difficulty is that a small draft model does not always agree with the target model, so we found it very effective to dynamically measure the agreement between the smaller draft model and the bigger target model, hence the name “probe sampling”. Here is the paper: arxiv.org/pdf/2403.01251….
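
To make the idea in the post concrete, here is a minimal Python sketch of one filtering step. It is an illustration under stated assumptions, not the paper's implementation: the function names (`probe_sampling_step`, `draft_loss`, `target_loss`), the use of a Spearman rank correlation as the agreement measure, and the rule that sizes the shortlist from that agreement are all hypothetical choices made for this example. The two loss functions are stand-ins for scoring a candidate with the small draft model and the large target model.

```python
import random


def rank(values):
    """Return the 0-based rank of each value (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks


def spearman(a, b):
    """Spearman rank correlation in [-1, 1] (simple formula, no tie correction)."""
    n = len(a)
    if n < 2:
        return 0.0
    ra, rb = rank(a), rank(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def probe_sampling_step(candidates, draft_loss, target_loss, probe_size=4, rng=None):
    """Pick the best candidate while calling target_loss as rarely as possible.

    Assumed scheme (illustrative only):
      1. score every candidate with the cheap draft model;
      2. score a small random probe set with the expensive target model;
      3. measure draft/target agreement on the probe set;
      4. keep fewer draft-ranked candidates when agreement is high;
      5. score the shortlist with the target model and return the best seen.
    """
    rng = rng or random.Random(0)
    n = len(candidates)
    probe = rng.sample(range(n), min(probe_size, n))

    draft = [draft_loss(c) for c in candidates]
    scored = {i: target_loss(candidates[i]) for i in probe}

    rho = spearman([draft[i] for i in probe], [scored[i] for i in probe])
    agreement = (1.0 + rho) / 2.0  # map [-1, 1] to [0, 1]

    # High agreement: trust the draft ranking and keep only a few candidates.
    keep = max(1, round((1.0 - agreement) * n))
    shortlist = sorted(range(n), key=lambda i: draft[i])[:keep]

    for i in shortlist:
        if i not in scored:
            scored[i] = target_loss(candidates[i])

    best = min(scored, key=scored.get)
    return candidates[best], scored[best], agreement, len(scored)
```

When the two models rank the probe set identically, the sketch evaluates the target model on only the probe set plus one shortlisted candidate; when they disagree completely, it falls back to scoring every candidate with the target model.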