Stephan

15 posts

@stephangazarov

19, prev engineering @Kaspersky, 🇷🇺🇺🇸

SF · Joined March 2020
255 Following · 37 Followers
Pinned Tweet
Stephan
Stephan@stephangazarov·
Before, coding agents had no way to actually verify that the voice agents they ship work. I built a CLI that lets your agent place real calls to voice agents on Vapi, LiveKit, and other platforms, so it can autonomously fix broken logic with better context.
> Runs on Haiku 4.5 and speaks via Deepgram (configurable persona + 7 languages)
> Forwards every event to stdout: STT transcript, tool calls, transfers, costs, provider warnings
> Measures the call from the inside: mouth-to-ear latency at p50-p99, per-turn TTS/STT/LLM, audio quality (clipping, SNR, drops)
All open source. npx vent-hq@latest init - no setup needed. Your agent auto-authenticates and generates an access token.
2
1
6
773
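To make the "p50-p99" latency figures concrete, here is a minimal sketch of how percentiles could be computed from per-turn mouth-to-ear latencies. It uses the nearest-rank method, which is an assumption; the tweet does not say which percentile method Vent uses, and the function name and sample values below are hypothetical.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one latency sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # The rank is 1-based; clamp to 1 so tiny p values still return the minimum.
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical per-turn mouth-to-ear latencies in milliseconds.
latencies_ms = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
summary = {f"p{p}": percentile(latencies_ms, p) for p in (50, 99)}
```

With only a handful of turns per call, p99 collapses to the slowest turn, which is worth keeping in mind when comparing runs.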
Xander Cogan
Xander Cogan@xandercogan·
“Everyone has a plan until they get punched in the face.” Throwback to my fight a few years ago. Sometimes, it’s as easy as just not giving up.
1
0
3
56
Stephan
Stephan@stephangazarov·
@ArtemKozlovets 50/50. A lot of the bugs are with audio buffers, in turn-taking specifically: each platform has a different way of connecting and managing audio flows.
0
0
0
12
Artem K
Artem K@ArtemKozlovets·
@stephangazarov Looks interesting. Does it actually run audio during the call?
1
0
1
23
Xander Cogan
Xander Cogan@xandercogan·
These people can’t be serious
1
0
2
56
Stephan
Stephan@stephangazarov·
@charlieholtz Might be worth building a browser inside Conductor so it’d be easier to work on web projects (like Cursor has)
0
0
5
330
Stephan
Stephan@stephangazarov·
The fix I shipped for a 6-month-old onnxruntime crash in the @livekit agents Silero VAD plugin just got merged. Feels nice.
2
0
4
169
Stephan
Stephan@stephangazarov·
Your coding agent places a call, reads back the full trace, patches the agent, and calls again. It keeps looping until the voice agent passes (default behavior, can opt out). All runs are also auto-persisted locally, so your agent can diff them whenever it needs to check for regressions.
0
0
3
155
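The call, read, patch, call-again loop described above can be sketched roughly as follows. This is not Vent's API: `place_call`, `patch_agent`, the `passed` field, the `max_attempts` cap, and the `run-NNN.json` file naming are all hypothetical stand-ins chosen for illustration.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def fix_loop(
    place_call: Callable[[], dict],       # hypothetical: places one call, returns its trace
    patch_agent: Callable[[dict], None],  # hypothetical: edits the voice agent given a trace
    run_dir: Path,
    loop: bool = True,                    # looping is the default; pass False to opt out
    max_attempts: int = 10,               # assumed safety cap, not from the source
) -> Optional[int]:
    """Call, persist the trace, patch, and repeat until a call passes.

    Returns the attempt number that passed, or None if none did.
    """
    run_dir.mkdir(parents=True, exist_ok=True)
    attempts = max_attempts if loop else 1
    for attempt in range(1, attempts + 1):
        trace = place_call()
        # Persist every run locally so later runs can be diffed for regressions.
        (run_dir / f"run-{attempt:03d}.json").write_text(json.dumps(trace, indent=2))
        if trace.get("passed"):
            return attempt
        if attempt < attempts:
            patch_agent(trace)
    return None
```

Because each trace is written before the pass check, failed runs are kept on disk too, which is what makes diffing between attempts possible.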
Stephan
Stephan@stephangazarov·
A bit of engineering context. The caller has to behave like a real user, or the coding agent ends up fixing bugs that don’t matter. Per turn, Haiku picks one of four decision modes: continue, wait, close, or hang up.

On the listening side, Vent’s own VAD detects when the agent stops speaking. It’s a vendored TEN VAD compiled to WebAssembly, running in-process with no external service. Two filters sit in front of it: it ignores quiet background noise, and it waits for two consecutive voice frames before deciding it’s speech (a single noise blip can’t fool it).

The silence threshold isn’t fixed. It adapts per turn (200–3000 ms) based on how the agent responds, so that it neither cuts the response mid-sentence nor inflates latency with silence.
1
0
2
217
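The two front filters and the bounded silence threshold could look something like the sketch below. Only the two-consecutive-frame rule and the 200-3000 ms range come from the thread; the energy floor, the VAD probability cutoff, and the grow/shrink adaptation rule are assumptions made for illustration, and none of the names here are Vent's or TEN VAD's.

```python
from dataclasses import dataclass

MIN_SILENCE_MS = 200   # lower bound from the thread
MAX_SILENCE_MS = 3000  # upper bound from the thread


@dataclass
class SpeechGate:
    """Declares speech only after two consecutive voiced frames above a noise floor."""
    noise_floor: float = 0.02   # hypothetical frame-energy cutoff for "quiet background noise"
    vad_threshold: float = 0.5  # hypothetical cutoff on the VAD's speech probability
    _streak: int = 0            # consecutive voiced frames seen so far

    def is_speech(self, energy: float, vad_prob: float) -> bool:
        voiced = energy >= self.noise_floor and vad_prob >= self.vad_threshold
        # Any unvoiced frame resets the streak, so a single blip never counts.
        self._streak = self._streak + 1 if voiced else 0
        return self._streak >= 2


def next_silence_ms(current_ms: float, resumed_after_cutoff: bool,
                    grow: float = 1.5, shrink: float = 0.9) -> int:
    """Assumed rule: wait longer after cutting the agent off, otherwise slowly tighten."""
    proposed = current_ms * (grow if resumed_after_cutoff else shrink)
    return int(min(MAX_SILENCE_MS, max(MIN_SILENCE_MS, proposed)))
```

Clamping after every update keeps the threshold inside the stated range no matter how many turns in a row push it in one direction.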