

I am building a course on shipping production Voice AI agents for a major online education platform. Before I teach it, I want to live on it. So I've been building a voice agent for diabhey.com.

Here is my current stack and learnings from v1:

• @livekit for realtime transport and the agent framework. Handles sessions, room events, and metrics out of the box.

• Silero VAD with min_silence_duration set to 250ms. The plugin default is 550ms. VAD tuning is the single biggest lever on how a voice agent actually feels. 550ms felt sluggish in conversation; 250ms felt natural. But go much lower and you'll cut users off mid-thought.

• @DeepgramAI for STT.

• @cerebras running Llama 3.1 8B for the LLM. Picked it for raw token throughput. In voice, tokens per second matters more than model size. You're racing a user's attention span, not a benchmark.

• @cartesia for TTS.

• @usemoss for retrieval. It's an in-process semantic search engine in Rust/WebAssembly, so lookups stay in the agent process with no network hop.

If you're shipping voice agents right now, what's moved latency the most for you? Drop it below. I'm collecting real patterns for the course.
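If you want to experiment with the VAD tuning above, here's a minimal sketch of how a pipeline like this could be wired up with the LiveKit Agents Python SDK. Treat it as an illustration, not production code: the plugin parameter names and the `with_cerebras` helper reflect my reading of the current plugin APIs, so verify against the LiveKit docs before copying.

```python
# Sketch: a LiveKit Agents voice pipeline matching the stack above.
# Assumes livekit-agents plus the silero/deepgram/openai/cartesia plugins
# are installed, and the usual API keys are set in the environment.
from livekit import agents
from livekit.agents import Agent, AgentSession
from livekit.plugins import cartesia, deepgram, openai, silero


async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    session = AgentSession(
        # min_silence_duration is in seconds; the plugin default is 0.55 (550ms).
        # 0.25 (250ms) felt natural in testing; much lower cuts users off.
        vad=silero.VAD.load(min_silence_duration=0.25),
        stt=deepgram.STT(),
        # Cerebras is reached via the OpenAI-compatible plugin helper.
        llm=openai.LLM.with_cerebras(model="llama3.1-8b"),
        tts=cartesia.TTS(),
    )
    await session.start(
        room=ctx.room,
        # Hypothetical system prompt, just for illustration.
        agent=Agent(instructions="You are a voice assistant for diabhey.com."),
    )


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
```

The nice part of this shape is that each stage is a swappable plugin, so A/B-testing a VAD threshold or a different TTS provider is a one-line change.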