Dialectra
@_dialectra
Collect, validate, and benchmark speech data for African languages to build better voice AI. Public TG https://t.co/IkVeY03wTN
Africa · Joined October 2022
9 Following · 4.4K Followers

AI is only as powerful as the data behind it.
But most of the world's languages are still missing from the AI revolution.
@_Dialectra is helping communities preserve, verify, and contribute native language data, giving every language a place in the future of AI.


Today, we’re excited to officially launch our Yoruba speech data campaign on Dialectra.
Over the past two months, we’ve seen contributors across Hausa, Kanuri, and Fulfulde help us build one of the fastest-growing African speech data communities.
Now it’s time to expand.
With Yoruba joining Dialectra, contributors can now participate in:
• Corpus script recordings
• Transcription tasks
• Live conversational speech through Dialect Connect
As always, every contribution goes through our transcription, annotation, standardization, and human verification pipeline before it becomes part of a training-ready dataset.
Yoruba is one of Africa’s most influential and widely spoken languages, yet high-quality conversational speech infrastructure for it remains limited.
We want to help change that.
If you speak Yoruba, you can now join Dialectra, contribute your voice, and help shape the future of African speech AI while earning rewards for your contributions.

Dialectra retweeted

Submitted a proof of personhood for an accelerator while in a hospital bed… all for cloud credits to keep building as a bootstrapped founder. We really are building from everywhere 😄
It has been a really interesting journey with @_dialectra.

Understanding how Dialectra works is one thing. Participating in it is another.
@_dialectra is not just a system built for users - it is a system built with users.
Every dataset begins with real people contributing their voice, their dialect, and their natural way of speaking. That is the entry point into the system.
And it starts with something simple: joining and becoming a contributor.

Dialectra retweeted

Today I'm going to talk about @_dialectra
Not in English, but in Hausa.
As someone who uses AI almost every day and spends a lot of time exploring AI tools and projects, I noticed something interesting.
Most AI voice agents can speak Hausa, but if you look deeper, many of them do not understand how Hausa people actually speak.
Kano Hausa is different.
Katsina Hausa is different.
Sokoto Hausa is different.
Even words, proverbs, and pronunciation change from region to region.
This is where Dialectra stands out.
Instead of focusing only on building an AI that talks, they collect authentic voice data from real Hausa speakers.
Not just read-aloud text.
They collect how people speak in everyday life, including accents, proverbs, and dialect differences across regions.
This matters because AI cannot understand what it has never learned.
If the data it was trained on does not represent real Hausa speakers, then however powerful the model is, it will make mistakes when it meets actual users.
What caught my attention most is that Dialectra is building a foundation, not just another voice AI app.
Today we talk about @ElevenLabs, @Hailuo_AI, and other voice AI platforms.
But have you ever thought about what would happen if such major platforms gained access to the high-quality Hausa data Dialectra is collecting?
What would happen if AI could recognize Kano, Katsina, or Sokoto Hausa without getting confused?
What would happen if AI could understand how Hausa people actually speak, rather than Hausa as it is written in textbooks?
In my view, this is the main thing that sets Dialectra apart.
It is not just building AI.
It is building the data that will help AI understand Hausa properly.
And that could be a major step for Hausa and other African languages in the world of AI.


A few days ago, we launched Dialect Connect — a simple way for people to have real conversations while contributing to African speech datasets.
Here’s where things stand already:
• 896 total conversation requests
• 703 completed conversations
• 107.4 hours of conversational speech collected
• 12 pending
• 8 active
• 173 rejected
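The counts above are internally consistent: completed, pending, active, and rejected requests add up to the reported total. A minimal Python sketch of that check, plus the resulting completion rate, is shown here. The dictionary keys and the function name are illustrative assumptions, not anything from Dialectra's platform.

```python
# Figures copied from the post; the names below are hypothetical labels.
REQUESTS = {"completed": 703, "pending": 12, "active": 8, "rejected": 173}
REPORTED_TOTAL = 896


def completion_rate(counts: dict[str, int]) -> float:
    """Share of all requests that ended as completed conversations."""
    total = sum(counts.values())
    return counts["completed"] / total


# The four statuses account for every reported request.
assert sum(REQUESTS.values()) == REPORTED_TOTAL

print(f"completion rate: {completion_rate(REQUESTS):.1%}")
```

On these figures, a little over three quarters of all requests ended as completed conversations.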
Alongside this, our corpus reading and transcription workflows have now passed 300,000 voice samples collected from Hausa-speaking contributors across our platform.
What matters to us is not just collecting audio.
The difficult part is what happens after collection.
Every contribution inside Dialectra.io goes through a structured pipeline:
→ Transcription
→ Annotation
→ Standardization
→ Human verification
→ Approval
We built this because raw voice recordings alone are not enough to train reliable speech systems.
Models need properly reviewed transcripts, dialect-aware normalization, quality checks, and consistent formatting before the data becomes useful for training.
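The five stages listed above can be pictured as a strictly ordered gate that each contribution has to clear. The following Python sketch is only an illustration under that assumption: the stage names come from the post, while the `Contribution` class, the `audio_id` field, and the in-order rule are hypothetical and say nothing about how Dialectra actually implements its pipeline.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    # Stage names follow the post; the strict ordering is an assumption.
    TRANSCRIPTION = 1
    ANNOTATION = 2
    STANDARDIZATION = 3
    HUMAN_VERIFICATION = 4
    APPROVAL = 5


@dataclass
class Contribution:
    audio_id: str  # hypothetical identifier for one recording
    passed: list[Stage] = field(default_factory=list)

    @property
    def training_ready(self) -> bool:
        """True only once every stage, including approval, has passed."""
        return len(self.passed) == len(Stage)

    def advance(self, stage: Stage, ok: bool) -> bool:
        """Record one stage result; stages must be attempted in order."""
        if self.training_ready:
            raise ValueError("contribution has already been approved")
        expected = list(Stage)[len(self.passed)]
        if stage is not expected:
            raise ValueError(f"expected {expected.name}, got {stage.name}")
        if ok:
            self.passed.append(stage)
        return ok


sample = Contribution("clip-001")
for stage in Stage:
    sample.advance(stage, ok=True)
print(sample.training_ready)
```

A contribution that fails a stage simply stays where it is, so it never counts as training-ready until a later attempt at that stage succeeds.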
This is where many African language datasets struggle.
A lot of existing datasets are scraped, weakly labeled, inconsistent, or missing conversational context entirely.
We are trying to approach this differently.
Dialectra is focused on building speech datasets that reflect how people actually speak — accents, dialects, pauses, code-switching, natural conversations, and regional differences included.
For voice AI startups and model builders, this matters more than dataset size alone.
Better infrastructure produces better models.
We’re still early, but it’s exciting seeing contributors across Hausa-speaking communities helping shape what this can become.
More updates soon.

Dialectra retweeted

If you want to learn more about Dialectra and why US..
Don't miss this AMA session.
Crypto Solutions 🕊️@creptosolutions
Clock it ogs...🔥
Dialectra retweeted

We launched "Dialect Connect" yesterday, and after just 24 hours the stats are really impressive.
A few days ago I had a simple idea: what if we could capture how people actually speak, not just how they read?
I implemented it yesterday as an additional feature for Dialectra.io.
24 hours later:
📞 371 conversation requests
✅ 303 completed conversations
🎙️ 45.1 hours of conversational speech collected
⏳ 5 pending
🟢 2 active
❌ 61 rejected
For years, most speech datasets have been built around scripted recordings. They are useful, but they only tell part of the story.
Language lives in conversations.
It lives in pauses, interruptions, storytelling, laughter, code-switching, local expressions, and the unique rhythm that makes every dialect different.
The future of voice AI will not be built solely on people reading sentences from a screen. It will be built on authentic human interactions.
That is what excites me most about these numbers.
In just 24 hours, hundreds of people chose to connect with complete strangers or friends and simply talk. In doing so, they generated something incredibly valuable: real-world conversational data for African languages and dialects.
Every completed conversation moves us closer to a future where AI can understand not only what Africans say, but how we say it.
When we started Dialectra, our mission wasn't just to collect voice data. It was to ensure that African languages, dialects, and identities are represented in the AI systems that will power the next generation of technology.
45.1 hours is a small number compared to where we're going.
But it's a reminder that the infrastructure for African voice AI won't be built in a lab alone. It will be built by communities, contributors, and everyday conversations happening across the continent.
We're still very early.

Dialectra retweeted

Africa is home to some of the world’s most spoken and culturally influential languages, yet modern AI systems still struggle to understand them accurately.
Hausa alone is spoken by an estimated 80 to 100 million people across West and Central Africa, particularly in Nigeria and Niger. Swahili, widely recognized as Africa’s leading lingua franca, connects more than 200 million speakers across East and Central Africa.
Arabic, one of the continent’s most dominant languages, is spoken by hundreds of millions across North Africa and parts of the Sahel, shaping commerce, education, religion, and communication throughout the region. Yet despite this enormous linguistic scale, African speech remains heavily underrepresented in global AI systems.
That is the gap @_dialectra is stepping in to close: building the speech infrastructure designed to help artificial intelligence truly understand how Africa speaks.


May this Eid bring peace, joy, and blessings to you and your loved ones. 🤍✨
#EidMubarak #Dialectra


Dialect Connect is now LIVE
You can now:
🤝 Find an online partner
📩 Send an invite to a friend
🎙️ Join live voice conversations
💰 Earn rewards for every completed session
Choose a topic or simply have a random conversation. Every discussion helps create high-quality conversational datasets for the next generation of African voice AI.
We're excited to see what the community creates.
Try it now: Dialectra.io


@_dialectra This is a welcome development, and I think it will allow users to share raw and more advanced data freely.

Coming Soon: "Dialect Connect"
One of the most exciting features we've built so far at Dialectra.
Two verified contributors. One topic. A live conversation.
Not scripts.
Not prompts.
Real conversations. Real dialects. Real speech patterns.
We're moving beyond read speech and transcription into authentic voice interactions, the kind of data needed to build the next generation of dialect-aware AI systems.
A big step toward our vision of building the infrastructure layer for African voice AI.
Stay tuned. More updates soon.


Good morning @_dialectra. I want to ask about a bug I found on your dashboard page: the notification section is not working. Whenever I click on it, nothing happens, as you can see in the picture below.


We are celebrating BTC Pizza Day. Go and claim 50 DX points for free.
dialectra.io

Dialectra@_dialectra
From 10,000 BTC for two pizzas 🍕 to a global financial movement ₿ Today we celebrate the transaction that proved Bitcoin could power real-world value exchange and sparked a revolution in digital finance. Happy Bitcoin Pizza Day from Dialectra 💙 Building infrastructure for the future of finance. #BitcoinPizzaDay

Claim your Bitcoin Pizza Day slice of DX points at dialectra.io
Bakaka@Abba_kakaa
We are celebrating BTC Pizza Day. Go and claim 50 DX points for free. dialectra.io

Happy Friday 🤲
Do you know what Dialectra's work actually achieves?
Simply put, Dialectra helps AI understand how people actually speak.
When voice data is collected properly, verified, structured, and then tested for quality,
AI becomes better at understanding how people speak in real life.
This is unlike before, when AI was confused by differences in accent and in how people pronounce words.
Now it will be better at:
recognizing speech accurately
understanding differences
and delivering high-quality results
In short:
If the data is authentic and well structured, AI will understand how people around the world really speak.

