Pavel Shtykovskiy

158 posts

@framrus

Particle physics and astrophysics → predicting ad clicks at Yandex → spoken language understanding → LLM-powered NPCs at https://t.co/A3vXLHYYGN

Berlin, Germany · Joined May 2014
1.4K Following · 88 Followers
Pavel Shtykovskiy retweeted
Inworld AI
Inworld AI@inworld_ai·
Inworld TTS-1.5 releases today. The #1 TTS on Artificial Analysis now offers realtime latency under 250ms, optimized expression and stability for user engagement, and costs half a cent per minute. Some voice models are fast, some are expressive, some are affordable. We outperform them all across the board.
Production-grade realtime latency: <250ms latency for the Max model, <130ms for Mini (P90 first audio), 4x faster than before. Voice agents now respond before users notice any delay.
Engagement-optimized quality: 30% more expressive to serve a wider range of personalities, and 40% lower word error rates for fewer hallucinations, word cutoffs, and audio artifacts.
Built for consumer scale: radically affordable, with enhanced multilingual support (15 languages including Hindi) and enhanced voice cloning, now via API. On-prem options now available for enterprises.
55 replies · 105 reposts · 491 likes · 285.2K views
Pavel Shtykovskiy retweeted
Inworld AI
Inworld AI@inworld_ai·
Our TTS Max model just debuted at #1 on the @ArtificialAnlys leaderboard. And at $10/million characters, it’s also the most cost-efficient commercial TTS model available. Excited to keep making state-of-the-art voice more accessible. Check it out at inworld.ai/tts or through our partners @pipecat_ai and @livekit.
Artificial Analysis@ArtificialAnlys

Inworld TTS 1 Max is the new leader on the Artificial Analysis Speech Arena Leaderboard, surpassing MiniMax's Speech-02 series and OpenAI's TTS-1 series.
The Artificial Analysis Speech Arena ranks leading text-to-speech models based on human preferences. In the arena, users compare two pieces of generated speech side by side and select their preferred output without knowing which models created them. The arena includes prompts across four real-world categories: Customer Service, Knowledge Sharing, Digital Assistants, and Entertainment.
Inworld TTS 1 Max and Inworld TTS 1 both support 12 languages, including English, Spanish, French, Korean, and Chinese, and voice cloning from 2-15 seconds of audio. Inworld TTS 1 processes ~153 characters per second of generation time on average, with the larger model, Inworld TTS 1 Max, processing ~69 characters per second on average. Both models also support voice tags, allowing users to add emotion, delivery style, and non-verbal sounds such as "whispering", "cough", and "surprised".
Both TTS-1 and TTS-1-Max are transformer-based, autoregressive models employing LLaMA-3.2-1B and LLaMA-3.1-8B respectively as their SpeechLM backbones.
See the leading models in the Speech Arena, and listen to sample clips below 🎧
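As a quick sanity check on the pricing claims in these tweets, here is a back-of-the-envelope calculation; the speaking rate of ~900 characters per minute of audio is my assumption, not a figure from the tweets:

```python
# Back-of-the-envelope check of the quoted TTS pricing.
PRICE_PER_MILLION_CHARS = 10.0   # USD, from the tweet
CHARS_PER_MINUTE = 900           # assumed English speaking rate (my assumption)

cost_per_minute = PRICE_PER_MILLION_CHARS / 1_000_000 * CHARS_PER_MINUTE
print(f"~${cost_per_minute:.4f} per minute of audio")  # ~$0.0090 per minute of audio
```

Under this assumption the cost lands around a cent per minute of audio; a slower assumed speaking rate pushes it toward the quoted half a cent.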

11 replies · 17 reposts · 140 likes · 16.1K views
Pavel Shtykovskiy retweeted
Chieh-Hsin (Jesse) Lai
Chieh-Hsin (Jesse) Lai@JCJesseLai·
Tired of going back to the original papers again and again? Our monograph offers a systematic and fundamental recipe you can rely on! 📘 We're excited to release 《The Principles of Diffusion Models》— with @DrYangSong, @gimdong58085414, @mittu1204, and @StefanoErmon. It traces the core ideas that shaped diffusion modeling and explains how today's models work, why they work, and where they're heading. 🧵 You'll find the link and a few highlights in the thread. We'd love to hear your thoughts and have some discussions! ⚡ Stay tuned for our markdown version, where you can drop your comments!
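As a taste of the machinery such a monograph covers, the forward noising process and denoising training objective in standard DDPM notation (a textbook formulation, not quoted from the book):

```latex
% Forward (noising) process, its closed-form marginal, and the
% denoising training objective, in standard DDPM notation.
q(x_t \mid x_{t-1}) = \mathcal{N}\!\big(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\big),
\qquad
q(x_t \mid x_0) = \mathcal{N}\!\big(x_t;\ \sqrt{\bar\alpha_t}\,x_0,\ (1-\bar\alpha_t) I\big),
\quad
\bar\alpha_t = \prod_{s=1}^{t}(1-\beta_s)
\\[4pt]
\mathcal{L}(\theta) = \mathbb{E}_{t,\,x_0,\,\epsilon \sim \mathcal{N}(0, I)}
\big\| \epsilon - \epsilon_\theta\big(\sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,\ t\big) \big\|^2
```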
53 replies · 493 reposts · 2.4K likes · 842K views
Pavel Shtykovskiy retweeted
Nathan Lambert
Nathan Lambert@natolambert·
Just signed a book deal for The RLHF Book, excited to make improvements to it this fall and get physical copies in your hands soon :) (rlhfbook dot com)
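For readers new to the topic, the optimization problem at the heart of RLHF can be sketched as follows (a standard textbook-style formulation, not quoted from the book):

```latex
% Standard RLHF objective: maximize the learned reward while staying
% close to the reference (SFT) policy via a KL penalty with weight beta.
\max_{\pi_\theta}\ \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi_\theta(\cdot \mid x)}
\big[ r_\phi(x, y) \big]
\;-\; \beta\, \mathbb{D}_{\mathrm{KL}}\!\big[ \pi_\theta(y \mid x)\ \big\|\ \pi_{\mathrm{ref}}(y \mid x) \big]
```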
31 replies · 18 reposts · 457 likes · 47.1K views
Pavel Shtykovskiy
Pavel Shtykovskiy@framrus·
@abacaj What happens to the eval loss when the train loss sharply decreases at the beginning of epochs 2 and 3? I've seen it jump up at the start of the 2nd/3rd epoch multiple times. Do people keep training in such cases because it's good for the final metrics?
0 replies · 0 reposts · 0 likes · 318 views
anton
anton@abacaj·
This is exactly why I don't really mess with PEFT / LoRA for fine-tuning... even though a full fine-tune is a longer process, it gives you a better domain model.
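To make the LoRA trade-off concrete, here is a minimal pure-Python sketch of the low-rank update LoRA trains (W' = W + (α/r)·B·A); the function names are illustrative, not the `peft` API:

```python
# Minimal LoRA illustration: instead of updating the full d_out x d_in
# weight matrix W, train two small factors B (d_out x r) and A (r x d_in)
# and add their scaled product to the frozen W at merge time.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_merge(W, A, B, alpha, r):
    delta = matmul(B, A)          # low-rank update, rank <= r
    scale = alpha / r             # standard LoRA scaling
    return [[W[i][j] + scale * delta[i][j]
             for j in range(len(W[0]))] for i in range(len(W))]

# Frozen 2x2 base weight plus a rank-1 adapter (r=1, alpha=2).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]                # d_out x r
A = [[3.0, 4.0]]                  # r x d_in
print(lora_merge(W, A, B, alpha=2.0, r=1))  # [[7.0, 8.0], [12.0, 17.0]]
```

For realistic layer sizes the factors hold r·(d_out + d_in) trainable parameters versus d_out·d_in for the full matrix, which is where the memory savings (and the expressiveness gap anton alludes to) come from.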
8 replies · 25 reposts · 190 likes · 53.3K views
Pavel Shtykovskiy retweeted
Chelsea Finn
Chelsea Finn@chelseabfinn·
Want to learn about meta-learning & few-shot learning? All of the latest lecture videos for Stanford CS330 are now online! youtube.com/playlist?list=… New topics in Fall '22 include: - self-supervised pre-training - large scale meta-optimization - domain adaptation & generalization
17 replies · 185 reposts · 918 likes · 150.9K views
Pavel Shtykovskiy
Pavel Shtykovskiy@framrus·
@yaroslavvb @wtpayne2 @yandex @YandexAI This is not true any more. People are scared, and not without reason. What is also scary is that the government's position has high support among the masses due to brainwashing (not in Y., mostly among less educated people).
1 reply · 0 reposts · 0 likes
Yaroslav Bulatov
Yaroslav Bulatov@yaroslavvb·
Yandex is a key tool in shaping the alternative reality that allows Ukraine war to continue with popular support. Many people are associated with @yandex or @YandexAI and remain silent on the issue. Silence is complicity. meduza.io/news/2022/03/0…
6 replies · 36 reposts · 87 likes
Pavel Shtykovskiy retweeted
Soumith Chintala
Soumith Chintala@soumithchintala·
Fun read on why MLOps is still somewhat broken -- the engineers who build the tools are not their users. In ML frameworks -- (Py)Torch, Theano, Caffe, MXNet, Keras, Chainer, TF, etc. -- the authors were ML scientists themselves, and that helped the design requirements be accurately in their heads.
Yaroslav Bulatov@yaroslavvb

Bananas and ML infrastructure: I've asked around about cloud workflows, and most of the feedback had unhappiness with cloud tooling. This prompted a discussion in @chipro's MLops community -- why are MLops frameworks so bad? (1/9)

10 replies · 37 reposts · 250 likes
Pavel Shtykovskiy retweeted
Yaroslav Bulatov
Yaroslav Bulatov@yaroslavvb·
Bananas and ML infrastructure: I've asked around about cloud workflows, and most of the feedback had unhappiness with cloud tooling. This prompted a discussion in @chipro's MLops community -- why are MLops frameworks so bad? (1/9)
10 replies · 62 reposts · 365 likes
Pavel Shtykovskiy retweeted
Sheldon Axler
Sheldon Axler@AxlerLinear·
Today the videos that I made to accompany my book Linear Algebra Done Right surpassed two million minutes of total viewing on YouTube. Those videos are freely available from the links at linear.axler.net/LADRvideos.html. #LinearAlgebra
34 replies · 496 reposts · 2.7K likes
Pavel Shtykovskiy retweeted
Sebastien Bubeck
Sebastien Bubeck@SebastienBubeck·
Just watched an incredible talk by @AlexGDimakis at the Simons Institute, highly recommended. Their Iterative Layer Optimization technique to solve inverse problems with GANs makes a LOT of sense! The empirical results on the famous blurred Obama face speak for themselves! 1/4
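For intuition, the generic "invert through a generator" idea the talk builds on can be sketched as follows; the toy generator, measurement operator, and step size are all my illustrative choices, not the ILO algorithm itself (which additionally optimizes intermediate layer activations):

```python
# Toy GAN-inversion sketch: given measurements y = A(G(z*)) for an
# unknown latent z*, recover z by gradient descent on the squared error.
def G(z):          # toy "generator": one latent scalar -> two outputs
    return [2.0 * z, z + 1.0]

def A(x):          # "measurement" operator: observe only the first coordinate
    return x[0]

def loss(z, y):
    return (A(G(z)) - y) ** 2

def grad(z, y, eps=1e-5):   # finite-difference gradient of the loss
    return (loss(z + eps, y) - loss(z - eps, y)) / (2 * eps)

y = A(G(1.5))      # measurements produced by a hidden z* = 1.5
z = 0.0            # start from an arbitrary latent
for _ in range(200):
    z -= 0.05 * grad(z, y)
print(round(z, 3))  # recovers the hidden latent, 1.5
```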
3 replies · 75 reposts · 444 likes