Hume AI

693 posts

@hume_ai

The foundational speech-language model with emotional intelligence. Available via TTS or Speech to Speech

New York, NY · Joined March 2021
25 Following · 23.3K Followers
Pinned Tweet
Hume AI @hume_ai ·
Today we're releasing our first open source TTS model, TADA! TADA (Text Audio Dual Alignment) is a speech-language model that generates text and audio in one synchronized stream to reduce token-level hallucinations and improve latency. This means:
→ Zero content hallucinations across 1,000+ test samples
→ 5x faster than similar-grade LLM-based TTS
→ Fits much longer audio: 2,048 tokens cover ~700 seconds with TADA vs. ~70 seconds in conventional systems
→ Free transcript alongside audio with no added latency
99 replies · 314 reposts · 2.9K likes · 257.5K views
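The token-budget claim above works out to roughly a third of a second of audio per token. A quick back-of-the-envelope sketch (the 2,048-token context and the ~700 s / ~70 s figures come from the tweet; the 5-minute clip is an illustrative assumption):

```python
# Seconds of audio per token for TADA vs. a conventional audio-token TTS,
# using the figures quoted in the tweet.
CONTEXT_TOKENS = 2048

tada_seconds = 700          # ~700 s of audio per context window (per tweet)
conventional_seconds = 70   # ~70 s in conventional systems (per tweet)

tada_rate = tada_seconds / CONTEXT_TOKENS               # ~0.34 s of audio per token
conventional_rate = conventional_seconds / CONTEXT_TOKENS  # ~0.034 s per token

# Tokens needed for a hypothetical 5-minute clip under each scheme:
clip = 5 * 60
print(round(clip / tada_rate))          # 878
print(round(clip / conventional_rate))  # 8777
```

At these rates, the same 5-minute clip costs about 10x fewer tokens under TADA's encoding, which is where the longer-audio claim comes from.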
Hume AI reposted
Hume AI @hume_ai ·
@wuzhu_ @huggingface @grok It is not fine-tuned for languages outside of English, though it has some multilingual capabilities!
0 replies · 0 reposts · 0 likes · 14 views
Hume AI @hume_ai ·
@CastrikoEdit It is not fine-tuned for languages outside of English, though it has some multilingual capabilities!
0 replies · 0 reposts · 0 likes · 161 views
Castriko @CastrikoEdit ·
@hume_ai I installed it locally on my PC to test it with my 4090. It uses almost all of my 24GB of RAM, and it takes nearly 300 seconds to generate this text with a voice that speaks quickly, skipping some words. In Spanish, the result is even worse, as it sounds robotic.
4 replies · 0 reposts · 2 likes · 617 views
Hume AI @hume_ai ·
@overbeight Please try again and let us know if you're still seeing this!
0 replies · 0 reposts · 0 likes · 302 views
overbait @overbeight ·
@hume_ai First try: Error. Second try: you have exceeded your quota. See you in 24hr! 👏
2 replies · 0 reposts · 47 likes · 2.7K views
Hume AI @hume_ai ·
@_NeutralFan_ It is not fine-tuned for languages outside of English, though it has some multilingual capabilities!
0 replies · 0 reposts · 0 likes · 191 views
Neutral Fan 𖣘 @_NeutralFan_ ·
@hume_ai This is really good, any plans to support multiple languages or only English?
2 replies · 0 reposts · 0 likes · 2.3K views
Hume AI @hume_ai ·
@rajbreno It is not fine-tuned for languages outside of English, though it has some capabilities in other languages.
0 replies · 0 reposts · 1 like · 315 views
Hume AI @hume_ai ·
@brubbleR Try again and let us know if you're running into issues!
1 reply · 0 reposts · 0 likes · 143 views
brubble @brubbleR ·
@hume_ai It's not working in the HF space.
1 reply · 0 reposts · 0 likes · 239 views
Hume AI @hume_ai ·
@0xecall Please try again and let us know if you're still running into this issue!
0 replies · 0 reposts · 0 likes · 213 views
ECALL @0xecall ·
@hume_ai I would love to try this but unfortunately your space isn't spacing
1 reply · 0 reposts · 3 likes · 864 views
Hume AI @hume_ai ·
Today, we're announcing significant new traction in our voice AI research and development partnerships. Over the past few months, we've quietly been expanding our focus on providing frontier voice AI training data to research labs looking to imbue emotion understanding and cutting-edge expressivity into their foundation models.

We believe some of the most exciting possibilities for deeper speech and emotion understanding will come to fruition this year. By training today's frontier models to understand the nuances of voice interaction (rife with subtle tones of frustration or satisfaction, "aha" moments, chuckles, sighs, backchannels, and interruptions), we believe that labs will unlock new possibilities for voice AI to become a primary interface within many applications.

In our experience working with leading research labs and AI-first enterprises, a consistent pattern we've seen is a greater need for high-quality datasets and evaluation pipelines than for new algorithms or architectures. Researchers spend up to 80% of their time curating the data they need to diagnose model issues, fix failure modes, and improve model behavior.

That's why Hume is now focused on building the data and evaluation infrastructure needed to train next-generation voice models across industry. With the right training, we hope that deeper voice understanding and empathy can be translated not just into more efficient interfaces but into better alignment of AI with human well-being. Read more below hume.ai/blog/data-blog…
11 replies · 11 reposts · 143 likes · 12.1K views
Hume AI @hume_ai ·
Can you convince Santa to take a villain off the naughty list? Visit 12DaysofAI.app for a whimsical voice AI puzzle every day leading up to Christmas 🎅 All made using the EVI API, the most expressive conversational agent platform in the industry. Build your own challenge, jolly or otherwise, at app.hume.ai!
1 reply · 1 repost · 16 likes · 2.5K views
Hume AI @hume_ai ·
The Control Plane API is now live for Hume's EVI (Empathic Voice Interface) — enabling fast & seamless creation of phone-based voice agents. Through the API, you can start a parallel connection to observe, modify settings, or send messages to the EVI chat. See the video below for a demo & tutorial:
3 replies · 2 reposts · 17 likes · 2.4K views
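To illustrate the pattern the Control Plane tweet describes (a second, parallel connection that observes and adjusts a live chat), the sketch below builds the kinds of JSON control messages such an API might accept. The message `type` values and field names here are hypothetical illustrations, not Hume's actual schema; see the video and docs linked from the tweet for the real API.

```python
import json

# Hypothetical control-plane messages for a live voice chat.
# The "type" values and field names are illustrative assumptions,
# NOT Hume's actual EVI schema.

def update_settings(chat_id: str, **settings) -> str:
    """Build a message that modifies session settings mid-chat."""
    return json.dumps({
        "type": "session_settings",   # assumed message type
        "chat_id": chat_id,
        "settings": settings,
    })

def inject_assistant_message(chat_id: str, text: str) -> str:
    """Build a message that sends text for the agent to speak."""
    return json.dumps({
        "type": "assistant_input",    # assumed message type
        "chat_id": chat_id,
        "text": text,
    })

# A controller process would send these over its parallel connection
# while the caller's audio stream stays open on the primary one.
print(update_settings("chat-123", language="en"))
print(inject_assistant_message("chat-123", "Transferring you now."))
```

The design point the tweet makes is the separation of concerns: the primary connection carries audio, while the control connection carries small structured messages like these, so settings changes never interrupt the stream.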
Hume AI @hume_ai ·
Today we honor Dr. Paul Ekman, whose pioneering work on how we express and perceive emotion laid much of the groundwork our field stands on today. His ideas still shape how we try to understand human feeling with nuance and respect. May he rest in peace. paulekman.com/blog/the-passi…
1 reply · 2 reposts · 21 likes · 2.3K views
Hume AI @hume_ai ·
One performance, infinite voices. Voice Conversion is now live on Hume’s creator studio and API! Generate the same pacing, pronunciation, and intonation with one recording across any voice you choose. Hear it for yourself ⬇️
20 replies · 33 reposts · 166 likes · 19.5K views
Hume AI @hume_ai ·
The key to handling these edge cases is contextual awareness. Octave 2, a speech-LLM that understands language context, formality, and other factors, is the first speech-language model that can read phone numbers reliably. See for yourself at platform.hume.ai
0 replies · 0 reposts · 4 likes · 795 views
Hume AI @hume_ai ·
Phone numbers can be read aloud in different ways. For instance, 555-010-2300 can be read as “five five five, zero one zero, two three zero zero” or as “five five five, oh one oh, two three hundred.” This ambiguity in the training data confuses AI models if they aren’t trained the right way. However, conveying data like phone numbers and emails is essential in most production-ready use cases, from customer service to digital health concierges to sales calls. How can we teach AI to resolve the ambiguities?
1 reply · 0 reposts · 5 likes · 880 views
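The two readings quoted in the tweet can be reproduced with two simple verbalization rules. This is an illustrative sketch only — real TTS front ends use far richer text-normalization rules, and this is not Hume's implementation:

```python
# Two common ways to verbalize a hyphen-grouped phone number,
# reproducing the ambiguity described in the tweet.

DIGITS = {"0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
          "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine"}

def read_plain(number: str) -> str:
    """Digit by digit; every 0 is read as 'zero'."""
    return ", ".join(" ".join(DIGITS[d] for d in g)
                     for g in number.split("-"))

def read_colloquial(number: str) -> str:
    """'oh' for zero; a group ending in '00' becomes '<prefix> hundred'."""
    spoken = []
    for g in number.split("-"):
        if g.endswith("00") and len(g) > 2 and set(g[:-2]) != {"0"}:
            prefix = " ".join("oh" if d == "0" else DIGITS[d] for d in g[:-2])
            spoken.append(prefix + " hundred")
        else:
            spoken.append(" ".join("oh" if d == "0" else DIGITS[d] for d in g))
    return ", ".join(spoken)

print(read_plain("555-010-2300"))
# five five five, zero one zero, two three zero zero
print(read_colloquial("555-010-2300"))
# five five five, oh one oh, two three hundred
```

Both outputs are valid verbalizations of the same string, which is exactly the training-data ambiguity the tweet says confuses models that lack the contextual awareness Octave 2 is described as having.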