Yuancheng Wang
7 posts

Yuancheng Wang
@yuancwang
Ph.D. student for speech at CUHKSZ; TTS, Audio, Speech LLM; Amphion, NaturalSpeech 3, MaskGCT.
Joined March 2024
71 Following, 17 Followers
Yuancheng Wang retweeted

🚀🚀🚀 A Zero-Shot TTS model MaskGCT (Masked Generative Codec Transformer) is open-sourced in Amphion now. Trained with Emilia. Only needs 5 sec speech to clone
Paper: arxiv.org/abs/2409.00750
HF: huggingface.co/spaces/amphion…
Discord: discord.gg/fRaQpH7s
Watch the demo by MaskGCT

@mohamed17381489 @realamphion @xutan_tx Hi, since the codec is designed for the TTS task, we currently achieve VC only by replacing the speaker embedding in the codec decoder. We will improve it soon!
Yuancheng Wang retweeted

Amphion now supports FACodec, the core component of NaturalSpeech 3, and the pretrained checkpoints are released.
Paper: arxiv.org/abs/2403.03100
Checkpoints: huggingface.co/amphion/natura…
Demo: huggingface.co/spaces/amphion…
Code: github.com/open-mmlab/Amp…
@xutan_tx @yuancwang
AK @_akhaliq
Microsoft presents NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models. While recent large-scale text-to-speech (TTS) models have achieved significant progress, they still fall short in speech quality, similarity, and prosody. Considering

