dsa

8.7K posts

dsa
@dsa

early yc, early twitter, early 23andme, late bloomer @livekit

future · Joined August 2007
327 Following · 10.1K Followers

Pinned Tweet
dsa @dsa
Today is a day I’ll never forget. I grew up in Cupertino. My dad was a tech founder in the 80s/90s. I was in YC S07. LiveKit is my 5th company. The first 4 didn’t work out. I’ve had a lot of advantages — it still took 20 years to get here. Founders: keep taking shots.
LiveKit @livekit

We learn to speak before we learn to read. Voice is the most natural interface we have. We just raised $100M to make building voice AI as easy as building a web app.

81 replies · 38 reposts · 930 likes · 92.3K views
dsa reposted
LiveKit @livekit
xAI STT is live. You can now run a complete cascaded voice agent pipeline on xAI (STT + Grok + TTS) through LiveKit Inference with one API key, giving you more control, full visibility, and easy component swaps.
11 replies · 11 reposts · 84 likes · 8.3K views
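The tweet above describes a fully cascaded pipeline (STT + Grok + TTS) running through LiveKit Inference under a single API key. Here is a minimal sketch of what that wiring could look like in the Python Agents framework; the "xai/..." descriptor strings are guesses based on the announcement, not confirmed model IDs, so check the LiveKit docs before using them.

```python
# Sketch: a cascaded voice pipeline on LiveKit Inference. With Inference,
# each stage is a provider/model descriptor string resolved through one
# LiveKit API key instead of separate plugins and credentials.
from livekit import agents
from livekit.agents import Agent, AgentSession


async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    session = AgentSession(
        stt="xai/grok-stt",   # hypothetical xAI STT descriptor
        llm="xai/grok-4-fast",  # hypothetical Grok descriptor
        tts="xai/grok-tts",   # hypothetical Grok TTS descriptor
    )
    await session.start(
        room=ctx.room,
        agent=Agent(instructions="You are a helpful voice assistant."),
    )


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
```

Because every stage is addressed by a string, the "easy component swaps" claim follows directly: changing STT providers is a one-line edit rather than a plugin and credential change.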
dsa reposted
LemonSlice @LemonSliceAI
We did it! We built the fastest interactive avatar model. Introducing LemonSlice-2.1 Flash ⚡ Here’s how we did it using @modal and @livekit 👇 (note: it was not easy)
90 replies · 59 reposts · 443 likes · 225.4K views
Jack @jackndwyer
@livekit Will never debug any other way again
1 reply · 0 reposts · 4 likes · 149 views
dsa reposted
LiveKit @livekit
We shipped Agent Console, a realtime debugging surface for voice agents. Talk to your agent and see the entire pipeline live, from audio and latency to tool calls, transcripts, and participant state. Available now in the LiveKit Cloud dashboard.
11 replies · 14 reposts · 72 likes · 11.4K views
dsa @dsa
@SteveAldrin_ @livekit this is cool, i’m gonna share with the team — maybe smth like this can be folded into the framework
1 reply · 0 reposts · 1 like · 140 views
steve aldrin @SteveAldrin_
Voice agents don't acknowledge you while you talk. They just wait. Humans don't. So we built real-time back-channeling for voice AI. Check this out 👇 Built with @livekit
2 replies · 1 repost · 8 likes · 458 views
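The project above trained a model for back-channeling; a much cruder way to approximate the behavior on top of LiveKit Agents is a timer keyed off the user's speaking state. A rough sketch, assuming the public AgentSession events and say() API; the threshold and phrases are made up for illustration.

```python
# Sketch: emit a short acknowledgment once the user has been speaking
# for a while. This uses timers, not a trained back-channel model like
# the one in the linked demo.
import asyncio
import random

from livekit.agents import AgentSession

BACKCHANNELS = ["mm-hmm", "right", "got it"]


def attach_backchannel(session: AgentSession, after_s: float = 3.0) -> None:
    task: asyncio.Task | None = None

    async def ack_later() -> None:
        await asyncio.sleep(after_s)
        # Keep it short so it overlaps the user's speech like a human
        # "mm-hmm"; don't record it in the chat context.
        session.say(random.choice(BACKCHANNELS), add_to_chat_ctx=False)

    @session.on("user_state_changed")
    def on_state(ev) -> None:
        nonlocal task
        if ev.new_state == "speaking":
            task = asyncio.create_task(ack_later())
        elif task is not None:
            task.cancel()
            task = None
```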
LiveKit
LiveKit@livekit·
Pronunciation is one of the fastest ways to break trust in a voice agent, especially in healthcare, legal, and finance where terminology matters. Rime's Mist v3 introduces phonetic brackets that let you define the exact pronunciation for any word and reproduce it deterministically. We built a demo nurse agent that stumbled on words like "levothyroxine" and "gastroesophageal," then fixed every one with a few lines of config. It's also fast: as low as 100ms TTFB. Try it on LiveKit Inference today.
4 replies · 13 reposts · 100 likes · 6.6K views
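The pronunciation-override idea reduces to a deterministic substitution pass over the text before it reaches TTS. A small sketch of that pre-processing step; the {...} phonetic spellings below are placeholders, not Rime's actual bracket notation, so consult the Mist v3 docs for the real syntax.

```python
# Sketch: substitute known problem terms with a bracketed phonetic form
# before synthesis, so every occurrence is pronounced the same way.
import re

PRONUNCIATIONS = {
    # term -> hypothetical phonetic-bracket form
    "levothyroxine": "{lee-voh-thigh-ROX-een}",
    "gastroesophageal": "{gas-troh-ih-sof-uh-JEE-ul}",
}


def apply_pronunciations(text: str) -> str:
    """Replace each known term with its deterministic phonetic form."""
    for term, phonetic in PRONUNCIATIONS.items():
        text = re.sub(rf"\b{re.escape(term)}\b", phonetic, text, flags=re.I)
    return text


print(apply_pronunciations("Take levothyroxine on an empty stomach."))
# -> Take {lee-voh-thigh-ROX-een} on an empty stomach.
```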
dsa @dsa
Today @livekit launched Data Tracks. Physical AI and robotics applications need low-latency, realtime transport for data beyond just audio and video. Data tracks let you transmit binary frames from any source (IMUs, LiDAR, RGBD cameras, control systems) with no codec overhead and the same low-latency semantics as media. They support full end-to-end encryption, and every frame includes a timestamp, so you can easily align data from different sensors. Excited to see what folks build with this! youtube.com/watch?v=Ju9Jz0…
1 reply · 2 reposts · 8 likes · 6.4K views
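For a sense of the shape of the feature: the announcement promises raw binary frames (no codec), end-to-end encryption, and a per-frame timestamp for cross-sensor alignment. The sketch below illustrates that model for an IMU stream, but the publish_data_track and write_frame names are invented for illustration; the real API lives in the LiveKit SDK docs.

```python
# Sketch (hypothetical API): stream 6-DoF IMU samples over a data track
# alongside the room's audio/video tracks.
import struct
import time

from livekit import rtc


async def stream_imu(room: rtc.Room, read_imu) -> None:
    # Hypothetical: open a named data track on the local participant.
    track = await room.local_participant.publish_data_track(name="imu")

    while True:
        ax, ay, az, gx, gy, gz = read_imu()  # one 6-DoF sample
        frame = struct.pack("<6f", ax, ay, az, gx, gy, gz)
        # Hypothetical: each frame carries a timestamp so consumers can
        # align IMU data with LiDAR/RGBD frames from other tracks.
        await track.write_frame(frame, timestamp_us=time.monotonic_ns() // 1_000)
```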
dsa reposted
Daytona @daytonaio
The cloud was not built for AI agents. Recently at the @daytonaio Compute Conference, @dsa, co-founder & CEO of @livekit, sat down with @mattturck, VC at @FirstMarkCap, to break down why stateful, long-running agent sessions cannot be deployed and scaled the same way as traditional web applications.
2 replies · 28 reposts · 35 likes · 4.6K views
dsa reposted
LiveKit @livekit
Gemini 3.1 Flash Live just dropped and it's available with LiveKit today. This is the first Gemini 3 native audio model on the Live API. Better instruction following, improved tool calling, reduced speaker drift, and support for 70+ languages. Audio in, audio out. No text conversion in between.
19 replies · 30 reposts · 287 likes · 39.9K views
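A native audio model collapses the cascaded STT/LLM/TTS pipeline into a single realtime stage. A minimal sketch using LiveKit's Google plugin; the model string for Gemini 3.1 Flash Live is assumed from the announcement, so verify the exact identifier in the plugin docs.

```python
# Sketch: speech-to-speech session via the Gemini Live API. Audio in,
# audio out -- there are no separate STT/TTS stages to configure.
from livekit.agents import Agent, AgentSession
from livekit.plugins import google

session = AgentSession(
    llm=google.beta.realtime.RealtimeModel(
        model="gemini-3.1-flash-live",  # assumed model ID
        voice="Puck",
    ),
)
# Then, inside a job entrypoint:
#   await session.start(room=ctx.room, agent=Agent(instructions="..."))
```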
dsa @dsa
Binh is building FSD for toy cars
Binh Pham @pham_blnh

day 2 of building a self-driving power wheel. today i officially trained a self-driving model from scratch and deployed it on the car. by just simply brute forcing everything, I:
> made a remote tele-op and remote data collection app built on @livekit infra
> feat: 60ms e2e latency between the car and inference compute (car and compute in vietnam with a singapore sfu)
> feat: data is collected on the operator side, baking latency into the observation space itself (I expect this made the model more robust against latency)
> recorded 30 min of data at 30fps and converted the dataset to lerobot (you can check a sample here)
> trained a simple ACT model (3 epochs, batch size 8) to drive the car around my house
> deployed the model on the car with remote inference
the video explains everything shortly.
reflection:
> the model is ofc bad, idt behavior cloning would work at all for such a complex task on such a small sample size
> it did work in some cases where the observation is well within distribution, even generalizing to backing the car up when it gets stuck
up next:
> will hack alpamayo (@nvidia) or @comma_ai’s e2e to somehow fit this
> or train with a llm backbone or a locomotion prior to see if it generalizes

0 replies · 0 reposts · 3 likes · 574 views
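The most transferable trick in the thread is collecting data on the operator's side so network latency is baked into the observation space: each logged observation is the stale frame the operator actually reacted to, so a policy trained on these pairs learns under the same lag it will face at deployment. A sketch of that logging scheme, with names invented for illustration rather than taken from Binh's code.

```python
# Sketch: operator-side logging that deliberately pairs the *delayed*
# frame with the operator's command, so latency is part of the data.
import time
from dataclasses import dataclass


@dataclass
class Step:
    t: float        # operator-side wall-clock time
    frame: bytes    # latest frame received over LiveKit (stale by ~e2e latency)
    steer: float    # operator command issued at time t
    throttle: float


def record_step(latest_frame: bytes, steer: float, throttle: float) -> Step:
    # No latency compensation on purpose: the staleness of latest_frame
    # becomes part of the training distribution.
    return Step(t=time.time(), frame=latest_frame, steer=steer, throttle=throttle)
```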
dsa reposted
Binh Pham @pham_blnh
was gonna do a small data collection and training run today, but thought: what if i give the raw controls to an agent first? (@livekit agent and @Google gemini) did not disappoint lol, it can actually navigate to objects around my room
2 replies · 2 reposts · 25 likes · 1.3K views
dsa reposted
LiveKit @livekit
How can a voice agent tell when you’re actually interrupting it? VAD is too sensitive—laughs, “mm-hmm,” or a sneeze shouldn’t stop the agent. We trained an audio model for adaptive interruption handling so agents can distinguish real interruptions from noise.
30 replies · 59 reposts · 615 likes · 43.5K views
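Short of a trained classifier, LiveKit Agents already exposes knobs that filter out brief noises before they count as interruptions. A sketch using session options available in recent livekit-agents releases; the threshold values and model descriptors here are illustrative, and the tweet's adaptive model goes further by classifying real interruptions versus backchannels.

```python
# Sketch: make the agent ignore sub-second bursts and one-word
# backchannels instead of stopping on any detected speech.
from livekit.agents import AgentSession

session = AgentSession(
    stt="deepgram/nova-3",          # example descriptors, swap freely
    llm="openai/gpt-4o-mini",
    tts="cartesia/sonic-2",
    allow_interruptions=True,
    min_interruption_duration=1.0,  # sneezes and laughs are too short to trigger
    min_interruption_words=2,       # "mm-hmm" alone won't cut the agent off
)
```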
dsa reposted
LiveKit @livekit
Grok's Text to Speech API is now available in LiveKit Inference. Natural, expressive voices with low-latency streaming. Multilingual in 20+ languages. Telephony- and production-ready out of the box. One API key. No extra setup. → docs.livekit.io/agents/models/…
xAI @xai

Grok's Text to Speech API is now available. Start building with natural voices and expressive controls to bring your apps to life. x.ai/api/voice#text-to-speech

71 replies · 92 reposts · 590 likes · 147.5K views
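Since LiveKit Inference addresses every pipeline stage by descriptor string, adopting the new voices should amount to changing one line. A hedged sketch, assuming a descriptor along the lines of "xai/grok-tts"; the real identifier is in the linked docs page.

```python
# Sketch: same cascaded pipeline as before; only the TTS stage changes.
from livekit.agents import AgentSession

session = AgentSession(
    stt="deepgram/nova-3",
    llm="openai/gpt-4o-mini",
    tts="xai/grok-tts",  # hypothetical LiveKit Inference descriptor
)
```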
dsa reposted
Tony Zhao @tonyzzhao
We raised $165M at a $1.15B valuation to stop doing demos. 2026 is about 1) deployment and 2) research. We will start shipping Memo with our new frontier models in a few months. Our Series B is led by Coatue, with Thomas Laffont joining the board. ->🧵
116 replies · 102 reposts · 1.5K likes · 375.7K views
dsa @dsa
jarvis make me dimsum 👀
0 replies · 0 reposts · 0 likes · 10 views
Gavin Ching @gching
if ur not talking to Codex with voice in 2026 ur ngmi 😌 voice + agents is the future. thx for the inspo @dsa
1 reply · 0 reposts · 2 likes · 605 views
dsa reposted
LiveKit @livekit
LiveKit turns 5 today. What began as an open source project now powers 300k+ developers, 5k+ customers, and billions of calls across voice, video, and physical AI agents. Next: building the infrastructure for voice-driven computing. Thank you to our community for 5 incredible years.
5 replies · 14 reposts · 88 likes · 6.1K views
dsa reposted
LiveKit @livekit
We shipped the tutorial for Agents UI. In 5 minutes you'll have a fully wired voice agent frontend with audio visualizers, media controls, and session management built directly into your codebase. Watch it, build it, own it. shadcn inside™.
LiveKit @livekit

Introducing Agents UI, an open-source @shadcn component library for building polished React frontends for your voice agents. Audio visualizers. Media controls. Session management tools. Chat transcripts. All wired to LiveKit Agents. Install via the shadcn CLI and own the code.

7 replies · 15 reposts · 189 likes · 20.3K views