Diffio AI (@diffioai)
Word alignment error relative to SNR. See 🧵 for details.
- WhisperX (github.com/m-bain/whisperX): performs forced alignment with an external phoneme/CTC aligner, typically a wav2vec2-based model, to align a known transcript to the waveform and recover word timestamps.
- whisper-char-alignment (github.com/30stomercury/w…): uses Whisper’s own decoder cross-attention maps, teacher-forces the reference text at character level, and applies DTW plus attention-head aggregation to infer word boundaries.
- OpenAI Whisper timing (github.com/openai/whisper): uses Whisper’s internal alignment heads and decoder cross-attention, then applies DTW over the token-to-frame attention matrix to derive word timestamps from the token sequence.
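
For readers unfamiliar with the DTW step that two of the methods above rely on, here is a minimal sketch of the idea, not the code from any of these repos. It assumes a small, already head-averaged attention matrix with values in [0, 1], uses `1 - attention` as the cost (an illustrative choice), and the names `dtw_path` and `token_start_frames` are hypothetical.

```python
import numpy as np

def dtw_path(cost):
    """Minimum-cost monotonic path through a (tokens x frames) cost matrix."""
    n_tokens, n_frames = cost.shape
    acc = np.full((n_tokens + 1, n_frames + 1), np.inf)
    acc[0, 0] = 0.0
    # step[i, j]: 0 = diagonal, 1 = advance token only, 2 = advance frame only
    step = np.zeros((n_tokens + 1, n_frames + 1), dtype=int)
    for i in range(1, n_tokens + 1):
        for j in range(1, n_frames + 1):
            candidates = (acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            k = int(np.argmin(candidates))
            acc[i, j] = cost[i - 1, j - 1] + candidates[k]
            step[i, j] = k
    # Backtrack from the last (token, frame) cell to the first.
    i, j = n_tokens, n_frames
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        k = step[i, j]
        if k == 0:
            i, j = i - 1, j - 1
        elif k == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

def token_start_frames(attention):
    """First frame the DTW path assigns to each token."""
    # Assumption for this sketch: attention is in [0, 1], so 1 - attention
    # is a non-negative cost that is low where the token attends strongly.
    path = dtw_path(1.0 - attention)
    starts = {}
    for tok, frame in path:
        starts.setdefault(tok, frame)
    return [starts[t] for t in range(attention.shape[0])]

# Toy example: 3 tokens, 6 frames, each token attends to 2 consecutive frames.
attention = np.array([
    [0.9, 0.9, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.9, 0.9],
])
frames = token_start_frames(attention)
print(frames)  # [0, 2, 4]
```

Multiplying each start frame by the frame duration (for example 0.02 s, assumed here) gives token start times. A word's start is then the start of its first token or character.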