nullHawk (@null_hawk)
i make GPUs go brrr... | building voice @rumik_ai
62 posts · Joined August 2023 · 124 Following · 40 Followers
nullHawk (@null_hawk):
Here is something that ma boi has been cookin for a while... it's really impressive!!!
Gauri Tripathi (@Gauri_the_great):
Do I have a SOTA model?
[image]
VioP (@AcousimHss):
type shi one has to do for data
[2 images]
nullHawk (@null_hawk):
It's been approx 20+ hrs and I finally reduced my GRPO runtime from ~12.8 hrs to ~1.5 hrs. Applied pretty much everything I had: 3 GPUs for DDP, vLLM inference hosted on the fourth, 8 samples generating in parallel along with codec tokens, grad_accum=4, data sharding across ranks, micro-batching within the loss function, TF32 tensor cores, BF16 autocast for the forward pass, Flash SDP + mem-efficient SDP, cuDNN benchmark mode, BF16 reduced-precision reduction... works well on Hopper architecture. Also turned on gradient checkpointing, manual gradient all-reduce (instead of full DDP wrapping), plus a file-based vLLM watcher that restarts the inference server with fresh merged weights at every optimizer step from a clean process tree (to avoid NCCL conflicts with torchrun), with retry logic on generation calls during server restarts.
Biggest debugging rabbit holes: vLLM V1 silently ignoring stop_token_ids (had to force V0 with VLLM_USE_V1=0), and merge_and_unload() with tie_word_embeddings=True dropping the trained lm_head during save, so the model generates infinite codec tokens and never stops. Fix: untie before merge so both embed_tokens and lm_head are saved separately.
GRPO on TTS is a different beast from text!
[image]
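The file-based watcher and retry pattern described in the tweet could be sketched roughly like this — a minimal stand-in, not the author's actual code; the polling interval, the `ConnectionError` stand-in for the client's failure mode, and the function names are all assumptions:

```python
import os
import time

def wait_for_new_weights(path, last_mtime, poll_s=0.05, timeout_s=5.0):
    """Poll `path` until its mtime changes, i.e. fresh merged weights were
    written by the training ranks. Returns the new mtime for the next round."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        mtime = os.path.getmtime(path)
        if mtime != last_mtime:
            return mtime
        time.sleep(poll_s)
    raise TimeoutError(f"no weight update seen on {path!r}")

def generate_with_retry(call, attempts=3, backoff_s=0.01):
    """Retry a generation call while the inference server is mid-restart.
    `call` is any zero-arg function; ConnectionError stands in for whatever
    the inference client raises when the server is down."""
    for i in range(attempts):
        try:
            return call()
        except ConnectionError:
            if i == attempts - 1:
                raise
            time.sleep(backoff_s * (2 ** i))  # exponential backoff
```

The point of running the watcher from a clean process tree, as the tweet notes, is that a vLLM server spawned under torchrun would inherit NCCL/distributed environment state and conflict with the training ranks.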
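The tied-embeddings save bug boils down to aliasing: with `tie_word_embeddings=True`, `embed_tokens` and `lm_head` share one buffer, so a save path that deduplicates tied parameters keeps only one copy and the trained head is lost. A toy sketch of the untie-before-merge fix, with plain Python lists standing in for weight tensors (the key names mirror the Hugging Face convention, but `untie_lm_head` is a hypothetical helper):

```python
# Toy state dict: tying means both keys alias the SAME buffer.
weights = [0.1, 0.2, 0.3]
state = {"embed_tokens": weights, "lm_head": weights}
assert state["lm_head"] is state["embed_tokens"]   # tied

def untie_lm_head(state):
    """Copy the shared buffer so embed_tokens and lm_head serialize
    separately; run this BEFORE merging adapters and saving."""
    state["lm_head"] = list(state["embed_tokens"])
    return state

untie_lm_head(state)
state["lm_head"][0] = 9.9            # head updates no longer alias embeddings
assert state["embed_tokens"][0] == 0.1
```

With real tensors the copy would be a `.clone()` of the shared weight plus flipping the tie flag in the config, so that the save path writes both tensors.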
nullHawk (@null_hawk):
though 50% of the time is spent by the reward model
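If the reward model really holds steady at 50% of each step, Amdahl's law caps what any further policy-side optimization can buy — this framing is mine, only the 50% figure comes from the tweet:

```python
def amdahl_speedup(component_speedup, fraction):
    """Overall speedup when only `fraction` of the runtime is accelerated
    by `component_speedup`x (Amdahl's law)."""
    return 1.0 / ((1.0 - fraction) + fraction / component_speedup)

# With the reward model pinned at 50% of the run, even an infinitely fast
# rollout/training side can at most halve total time:
assert round(amdahl_speedup(1e9, 0.5), 3) == 2.0
# A realistic further 4x on the other half yields only 1.6x overall:
assert round(amdahl_speedup(4, 0.5), 3) == 1.6
```

Which is why, past this point, the reward model itself becomes the thing to batch, quantize, or parallelize.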
atulit (@atulit_gaur):
@null_hawk I was supposed to apply 🤣
atulit (@atulit_gaur):
anybody else's personal dms look like this? i have obviously not opened these links
[image]
nullHawk (@null_hawk):
just call the "LLM" a "Policy" voila!!! you are an RL guy now
Kadir Nar (@kadirnardev):
The loss value of my new omni model is 1.5 👀
[image]
nullHawk retweeted
Rohan (@lets_dig_deeper):
meet priya. and for the next 47 secs listen to her story.
Varun Deep Saini (@varundeepsaini):
This is either the biggest fumble or the biggest bag
[image]
habib (@habibtwts):
eid shopping done
[image]
nullHawk retweeted
VioP (@AcousimHss):
New blog!! From Air Pressure to Speech: Exploring the classical TTS pipeline. i walk you through how tts works, from capturing sound from a microphone to getting wav files. hope you guys enjoy ;) hackmd.io/@VioSIlverP/rk…
[4 images]
VioP (@AcousimHss):
My yearly perfmaxxing post. Ice cream was still better than this coffee tho 😛
[image]
nullHawk retweeted
Andy (@prompt_Tunes):
put out a new worklog on optimizing the snake1d activation kernel in triton
[4 images]
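For context, "Snake" usually refers to the periodic activation snake(x) = x + sin²(αx)/α (Ziyin et al.), common in neural vocoders for audio. Assuming the worklog's kernel implements this standard form, a pure-Python reference of what the Triton kernel would compute elementwise:

```python
import math

def snake1d(xs, alpha=1.0):
    """Reference Snake activation: x + sin^2(alpha*x)/alpha, elementwise.
    alpha controls the frequency of the periodic bump; the learned per-channel
    alpha in real models is replaced here by a single scalar."""
    return [x + math.sin(alpha * x) ** 2 / alpha for x in xs]

# sin(0) = 0, so Snake is the identity at the origin:
assert snake1d([0.0]) == [0.0]
```

The per-element independence is exactly what makes it a good first Triton exercise: each program instance can process a contiguous block with no cross-element communication.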