
Samuele Cornell
@SamueleCornell
Post-doc @ CMU LTI. Audio and speech researcher.

We are thrilled to announce the Interspeech 2025 URGENT Challenge, starting on 11/15! Join us in building universal speech enhancement models to tackle in-the-wild speech data using large-scale, multilingual data. Details: urgent-challenge.github.io/urgent2025/

We're organizing a special issue at Computer Speech & Language on Multi-Speaker, Multi-Microphone, and Multi-Modal Distant Speech Recognition. Deadline: December 2, 2024. Details: sciencedirect.com/journal/comput…
@chimechallenge

Today, we release several Moshi artifacts: a long technical report with all the details behind our model, weights for Moshi and its Mimi codec, along with streaming inference code in PyTorch, Rust, and MLX. More details below 🧵 ⬇️ Paper: kyutai.org/Moshi.pdf Repo: github.com/kyutai-labs/mo… HuggingFace: huggingface.co/kmhf
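For readers who want to poke at the released weights, here is a minimal round-trip through the Mimi codec. This is a sketch assuming the Hugging Face `transformers` Mimi port (MimiModel and the "kyutai/mimi" checkpoint); the kyutai-labs/moshi repo also ships its own native loaders.

```python
# Minimal sketch: round-trip audio through the released Mimi codec.
# Assumption: the Hugging Face `transformers` Mimi port (MimiModel) and the
# "kyutai/mimi" checkpoint, rather than the repo's native PyTorch loaders.
import torch
from transformers import AutoFeatureExtractor, MimiModel

model = MimiModel.from_pretrained("kyutai/mimi")
feature_extractor = AutoFeatureExtractor.from_pretrained("kyutai/mimi")

# One second of silence at Mimi's sampling rate, as a stand-in for real speech.
sr = feature_extractor.sampling_rate
audio = torch.zeros(sr).numpy()
inputs = feature_extractor(raw_audio=audio, sampling_rate=sr, return_tensors="pt")

# Encode to discrete codec tokens, then decode back to a waveform.
codes = model.encode(inputs["input_values"]).audio_codes
reconstruction = model.decode(codes).audio_values
print(codes.shape, reconstruction.shape)
```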

We are thrilled to announce the URGENT 2024 Challenge - a new speech enhancement (SE) competition at NeurIPS 2024: urgent-challenge.github.io/urgent2024 This challenge aims to unify diverse distortions and sampling frequencies using a single universal SE model. #URGENT2024 (1/4)
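To make the "single model, many sampling frequencies" idea concrete, one common serving pattern is to resample every input to the model's rate, enhance, and resample back. The sketch below is just an illustration of that pattern, not the challenge's baseline; `enhancer`, `MODEL_SR`, and the function name are hypothetical.

```python
# One common pattern for "one model, many sampling rates": resample the input
# to the model's training rate, enhance, then resample back to the source rate.
# Illustration only, not the URGENT baseline; `enhancer` and MODEL_SR are
# hypothetical stand-ins.
import torch
import torchaudio.functional as F

MODEL_SR = 48_000  # hypothetical training rate of the "universal" model

def enhance_any_rate(wav: torch.Tensor, sr: int, enhancer) -> torch.Tensor:
    """Enhance a (channels, samples) waveform recorded at an arbitrary rate."""
    x = F.resample(wav, orig_freq=sr, new_freq=MODEL_SR) if sr != MODEL_SR else wav
    with torch.no_grad():
        y = enhancer(x)
    return F.resample(y, orig_freq=MODEL_SR, new_freq=sr) if sr != MODEL_SR else y

# Usage with an identity "enhancer" on 16 kHz noise, just to show the plumbing.
noisy = torch.randn(1, 16_000)
print(enhance_any_rate(noisy, sr=16_000, enhancer=lambda x: x).shape)
```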

This is the second call for papers for the SynData4GenAI workshop. Please mark your calendar for the submission deadline (June 18, 2024, after the Interspeech acceptance notification)! I'm also pasting the CFP.


So far, we only used SURT for transcription, without worrying about speaker labels. In Ch. 7, we show how to jointly perform transcription and streaming speaker attribution in the SURT framework. This work has been submitted to Odyssey'24: arxiv.org/abs/2401.15676 9/n
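A schematic of the idea (not the paper's exact architecture): once SURT's unmixing step has produced one encoder stream per output branch, each stream can feed both the transducer head and an auxiliary speaker head, so tokens and speaker labels come out of the same streaming pass. All module names and dimensions below are made up for illustration.

```python
# Schematic only: per-branch heads for joint transcription and speaker
# attribution. All names and sizes are hypothetical, and the real system uses a
# full transducer (joiner + prediction network) rather than a linear ASR head.
import torch
import torch.nn as nn

class BranchHeads(nn.Module):
    def __init__(self, enc_dim=256, vocab_size=500, max_speakers=4):
        super().__init__()
        self.asr_head = nn.Linear(enc_dim, vocab_size)    # stand-in for the transducer
        self.spk_head = nn.Linear(enc_dim, max_speakers)  # auxiliary speaker branch

    def forward(self, enc_stream):
        # enc_stream: (batch, frames, enc_dim), one unmixed SURT branch
        return self.asr_head(enc_stream), self.spk_head(enc_stream)

heads = BranchHeads()
enc = torch.randn(1, 100, 256)  # 100 frames from one branch's encoder
asr_logits, spk_logits = heads(enc)
print(asr_logits.shape, spk_logits.shape)  # (1, 100, 500), (1, 100, 4)
```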
