Soynade Research
18 posts

@soynade
The moon shines for everyone. https://t.co/hjdcnXFWAE
Joined November 2024
0 Following · 104 Followers

Release 4 of the Soynade Open Source Month.
Oolel-Embed: a model that retrieves documents directly from speech, without going through costly intermediate speech recognition and translation steps.
Model: huggingface.co/soynade-resear…

Release 3 of the Soynade Open Source Month.
Oolel-Voices: a speech generation model supporting voice cloning with expressive, modular control over tone and pace, making it suitable for content creation.
Try it now: huggingface.co/spaces/soynade…
Model: huggingface.co/soynade-resear…

Continued pre-training allows us to be more compute-optimal than Orange's model while significantly outperforming the base Meta/HuBERT-Base model.
We release the ASR fine-tuned model along with 100 hours of clean Wolof ASR data.
Models and dataset here:
huggingface.co/collections/so…


Release 2 of the Soynade Open Source Month.
A small foundational speech representation model for Wolof, obtained by continued pre-training of Meta/HuBERT on 860 hours of Wolof speech. This improves ASR performance using only unlabeled speech data.
huggingface.co/soynade-resear…


- AfVoices-Translated: huggingface.co/datasets/soyna…
- FineWeb-Wolof-50k: huggingface.co/datasets/soyna…
- Oolel-Translator: github.com/soynade-resear…

Today we kick off Soynade's Open Source Month, four weeks of releasing models, datasets, and tools for African languages.
Learn more: soynade.ai/research/soyna…
The first release is live:
→ AfVoices-Translated: 200k+ Bambara-English speech translation dataset with acoustic tags.

A fun little experiment you can reproduce: generate text with our LLM Oolel and voice it using the Text-to-Speech model from @galsenai.

Oolel: A High-Performing Open LLM for Wolof
We trained Qwen2.5 on high-quality curated Wolof data, resulting in the best open-source LLM for this language.
Download and try the model: huggingface.co/soynade-resear…
Learn more: tinyurl.com/oolel
