aoi
205 posts

@bigvalleyblue
cofounder @liva_ai (@ycombinator s25) prev @harvard research on continual learning

Smart Turn v3.1.

Smart Turn is a completely open source, open data, open training code turn detection model for voice AI, trained on audio data across 23 languages. The model operates on the input audio in a voice agent pipeline. Each time the user pauses briefly, this model runs and returns a binary decision about whether the user has finished speaking or not.

The 3.1 release has two big improvements:

1. New data sets for English and Spanish, collected and labeled by contributors Liva AI, Midcentury, and MundoAI. The majority of the training data for the Smart Turn model is synthetically generated. Using synthetic data makes it possible to scale up training for a model like this. We've done a lot of work on the synthetic data pipeline to emulate as much of the natural variability of human speech as possible. But accurately labeled human data is very valuable and has a measurable impact on model quality. The 3.1 training run incorporates three new human data sets.

2. An unquantized, GPU-oriented version of the model alongside the ONNX version intended to run on CPUs. The Smart Turn ONNX quant delivers a result in 12ms on my laptop and 70ms on a typical cloud vCPU. That's fast! Because this is an audio model, you can run it in parallel with transcription and it will generally give you a result before the transcription final chunks are available. But if you have GPUs in your fleet, you can run the model even faster. (Or, more to the point, very scalably.) Inference runs in ~2ms on an NVIDIA L40S.

Read the launch blog post if you're interested in more details. And if you're running this model yourself, see notes in the blog post about ONNX runtime optimization.

The Smart Turn model is fully integrated into Pipecat, and available in Pipecat Cloud.
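
A rough sketch of the "run it in parallel with transcription" idea from the post, for anyone who wants to picture the control flow. Nothing here is the Smart Turn or Pipecat API: `detect_turn_end`, `transcribe`, and `on_pause` are hypothetical stand-ins, the sleeps only simulate latency (12ms borrowed from the laptop number in the post, 50ms is an arbitrary guess for transcription), and the "decision" is a placeholder rather than a real model call.

```python
import asyncio
from typing import Tuple


async def detect_turn_end(audio: bytes) -> bool:
    """Hypothetical turn detector: True means the user has finished speaking."""
    await asyncio.sleep(0.012)  # simulated latency, not a measurement
    return len(audio) > 0  # placeholder decision, stands in for the model output


async def transcribe(audio: bytes) -> str:
    """Hypothetical transcriber that takes longer than the turn detector."""
    await asyncio.sleep(0.05)  # assumed latency, purely illustrative
    return "placeholder transcript"


async def on_pause(audio: bytes) -> Tuple[bool, str]:
    """Called each time the user pauses briefly; runs both tasks concurrently."""
    turn_task = asyncio.create_task(detect_turn_end(audio))
    text_task = asyncio.create_task(transcribe(audio))
    # The binary turn decision is awaited first; in this simulation it is
    # ready before the transcript because its sleep is shorter.
    finished = await turn_task
    text = await text_task
    return finished, text


if __name__ == "__main__":
    print(asyncio.run(on_pause(b"\x00\x01")))
```

Both tasks start on the same audio buffer, so the turn decision does not wait on the transcript. In a real pipeline the agent would presumably keep listening when the decision is `False` instead of responding.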

Discord has easily been the highest ROI app in my life. I learned Minecraft there, got into college, met my cofounder (4 yrs ago), found a customer, hired, all thanks to Discord, and now our entire company basically runs on the app lol. Never bet against Discord