
dsa

@dsa
early yc, early twitter, early 23andme, late bloomer @livekit

We learn to speak before we learn to read. Voice is the most natural interface we have. We just raised a $100M to make building voice AI as easy as a web app.

10k stars on livekit/agents. We released version 1.0 a year ago. Today, our customers are building agents for healthcare, finance, insurance, education, robotics, and more. It’s been amazing to see our community grow over the past year. Thank you to everyone building with us.

day 2 of building a self-driving power wheel

today I officially trained a self-driving model from scratch and deployed it on the car by simply brute-forcing everything. I:
> made a remote tele-op and remote data collection app built on @livekit infra
> feat: 60ms e2e latency between the car and inference compute (car and compute in Vietnam with a Singapore SFU)
> feat: data is collected on the operator side, baking latency into the observation space itself (I expect this made the model more robust to latency)
> recorded 30 min of data at 30fps and converted the dataset to lerobot (you can check a sample here)
> trained a simple ACT model (3 epochs, batch size 8) to drive the car around my house
> deployed the model on the car with remote inference

the video explains everything briefly

reflection:
> the model is ofc bad, idt behavior cloning would work at all for such a complex task on such a small sample size
> it did work in some cases where the observation is well within distribution, and it even generalizes to backing the car up when it gets stuck

up next:
> will hack alpamayo (@nvidia) or @comma_ai ’s e2e to somehow fit this
> or train with an llm backbone or a locomotion prior to see if it generalizes

Grok's Text to Speech API is now available. Start building with natural voices and expressive controls to bring your apps to life. #text-to-speech x.ai/api/voice#text…


Introducing Agents UI, an open-source @shadcn component library for building polished React frontends for your voice agents. Audio visualizers. Media controls. Session management tools. Chat transcripts. All wired to LiveKit Agents. Install via the shadcn CLI and own the code.