Siddharth Pradhan
62 posts


Building real-time voice apps is humbling: the networking layer is just as tricky as the AI parts.
There's still a lot I want to understand about how WebRTC actually works under the hood. Will share what I find 👀
#WebRTC #VoiceAI #BuildInPublic #Networking #LearningInPublic

There was a second issue hiding behind the first one too.
Browsers block microphone and camera access on any page that isn't served over HTTPS (or localhost). I was on plain HTTP via the VM's IP, so the mic was completely locked out. No prompt, no warning.
#WebRTC #VoiceAI #Networking
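
A rough sketch of that rule in Python, to make the failure mode concrete. This is a simplified approximation of the browser's secure-context check, not the full specification, and `mic_allowed_origin` is a hypothetical helper name:

```python
import ipaddress
from urllib.parse import urlsplit


def mic_allowed_origin(url: str) -> bool:
    """Approximate whether a browser would treat this page as a secure context.

    Simplified assumption: HTTPS pages, localhost names, and loopback IPs
    qualify; everything else (for example plain HTTP on a VM's IP) does not.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.scheme == "https":
        return True
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        # 127.0.0.0/8 and ::1 count as loopback.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal and not a localhost name.
        return False
```

With this sketch, `mic_allowed_origin("http://203.0.113.5:7860")` is `False` (the address and port are made-up examples), while the same app reached through `http://localhost:7860` passes.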

Spent way too long debugging this today.
I was building a voice agent using #Pipecat and SmallWebRTC on a remote VM. Tried to access it the usual way: SSH port forwarding from my IDE.
The connection never established. No error. Just silence.
#WebRTC #VoiceAI #Pipecat #Networking
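
The post does not state the root cause. One common explanation, offered here as an assumption, is that SSH port forwarding tunnels TCP only, while WebRTC media usually flows over UDP to the addresses advertised in ICE candidates. Forwarding the signalling port alone would then leave the audio path unreachable. The sketch below parses a candidate line to show which transport and address the peer is actually offering; the example line and the `parse_candidate` helper are hypothetical:

```python
from typing import NamedTuple


class IceCandidate(NamedTuple):
    transport: str  # "udp" or "tcp"
    address: str
    port: int
    kind: str       # "host", "srflx", "relay", ...


def parse_candidate(line: str) -> IceCandidate:
    """Parse the leading fields of an ICE candidate line.

    Field order: foundation, component, transport, priority,
    address, port, the literal "typ", candidate type.
    """
    fields = line.split()
    if len(fields) < 8 or fields[6] != "typ":
        raise ValueError(f"not an ICE candidate line: {line!r}")
    return IceCandidate(fields[2].lower(), fields[4], int(fields[5]), fields[7])


# Made-up candidate, shaped like what a server on a private VM might advertise.
EXAMPLE = "candidate:1 1 udp 2122260223 10.0.0.12 49152 typ host"

if __name__ == "__main__":
    print(parse_candidate(EXAMPLE))
```

If the candidates logged by the browser look like this one (UDP, private VM address), a TCP-only tunnel to the web port would not reach them.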

Building this requires balancing concurrency, audio buffers, and network hops all at once.
It’s a reminder that great voice UX is about managing the space between sound and silence. Pipecat handles this beautifully.
#VoiceAI #SoftwareEngineering #Pipecat

The magic is in the pipeline:
• Streaming audio frames
• Tuning VAD for precision timing
• Handling interruptions
• Reducing server & network latency
It’s about managing the flow of sound and silence.
#VoiceAI #SoftwareEngineering
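
To make the VAD tuning point concrete, here is a minimal energy-based sketch. It is not Pipecat's API or its VAD; the threshold and the `start_frames` / `stop_frames` parameters are illustrative assumptions that stand in for the timing knobs a real VAD exposes:

```python
import math
from typing import List, Sequence, Tuple


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square level of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech(
    frames: Sequence[Sequence[int]],
    threshold: float = 500.0,
    start_frames: int = 2,
    stop_frames: int = 3,
) -> List[Tuple[int, int]]:
    """Return inclusive (first_frame, last_frame) index pairs judged to be speech.

    Speech starts after `start_frames` consecutive loud frames and ends after
    `stop_frames` consecutive quiet frames, so short blips and short pauses
    do not flip the state.
    """
    segments: List[Tuple[int, int]] = []
    speaking = False
    loud_run = quiet_run = 0
    start = 0
    for i, frame in enumerate(frames):
        if rms(frame) >= threshold:
            loud_run += 1
            quiet_run = 0
        else:
            quiet_run += 1
            loud_run = 0
        if not speaking and loud_run >= start_frames:
            speaking = True
            start = i - start_frames + 1
        elif speaking and quiet_run >= stop_frames:
            speaking = False
            segments.append((start, i - stop_frames))
    if speaking:
        # Audio ended while still inside a speech segment.
        segments.append((start, len(frames) - 1))
    return segments
```

Raising `stop_frames` makes the agent wait longer before treating a pause as the end of a turn; lowering it makes replies snappier but risks cutting the speaker off.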

Real-time voice AI is a systems engineering challenge.
Diving into #Pipecat has highlighted how much work goes into the orchestration of a live conversation. You’re essentially building a real-time #distributedSystem where every millisecond counts.
#VoiceAI #SoftwareEngineering

@alexocheema @exolabs I am telling you, everyone:
Privacy is the MOAT in the upcoming years, and on-device LLMs and stuff like exolabs will BLOOM!

China is way ahead on AI adoption.
A school in Beijing has repurposed old Macs to run personalised AI agents 100% locally using @exolabs
The Macs were previously used in their film studies lab, for video editing.
They have ingested their entire corpus of school data: curriculums, reports, instructional materials and learning objectives - so it’s grounded in all their data in real time.
In order to get accurate answers, they need frontier models, which are BIG - memory is the constraint (not FLOPS). Apple devices with unified memory have a lot of high-speed memory, so stacking enough of them makes it possible to run massive models.
A big concern of schools and parents is data privacy - when students or teachers use models in the cloud they are sending all their data in plaintext to the model provider. Even if schools have policies around this there’s always the risk someone accidentally copy-pastes sensitive data into the model - data leakage is inevitable.
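
The "memory is the constraint" point can be checked with back-of-the-envelope arithmetic. The sketch below counts only model weights and ignores KV cache, activations, and OS overhead; the model size, precision, and per-device memory in the usage note are illustrative assumptions, not figures from the post:

```python
import math


def weights_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes).

    params_billion * 1e9 parameters * (bits / 8) bytes each, divided by 1e9.
    """
    return params_billion * bits_per_param / 8


def devices_needed(model_gb: float, usable_gb_per_device: float) -> int:
    """Minimum number of devices whose pooled memory holds the weights."""
    return math.ceil(model_gb / usable_gb_per_device)
```

For example, a hypothetical 70B-parameter model at 4 bits per weight needs `weights_gb(70, 4)` = 35 GB for weights alone, so with 16 GB usable per machine `devices_needed(35.0, 16.0)` gives 3.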

@sandislonjsak @Abhinavstwt Clerk was using GCP and went down for an hour, still no one noticed

This Resume has an ATS score of more than 92🤯
This resume helped many people get interview calls from companies like Google, Microsoft, Amazon, and many more.
I have personally used this single-column resume in my job hunting and got amazing results.
I am sharing the exact same editable ATS-friendly resume templates!
To get it:
1. Follow me
(So that I can DM)
2. Like & Repost
3. Reply "Resume"
Follow me so I will dm immediately 💯

Follow-up to my past Prisma adventures:
problem solved (sort of).
Deleted all migration history,
made a fresh base migration,
manually marked everything as applied. Prisma’s happy again. But is there no safe way to do this?
#Prisma #PostgreSQL #Database #ORM
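
The steps in this post resemble what Prisma's documentation calls baselining. As a hedged sketch, the helper below only builds the two CLI invocations as argument lists and does not run anything. The migration name `0_init`, the schema path, and the `BaselinePlan` / `baseline_plan` names are assumptions, and the `--to-schema-datamodel` flag may be named differently in other Prisma versions, so check `prisma migrate diff --help` first:

```python
from typing import List, NamedTuple


class BaselinePlan(NamedTuple):
    diff_cmd: List[str]     # prints SQL for the full schema to stdout
    sql_path: str           # where that SQL should be saved by the caller
    resolve_cmd: List[str]  # records the migration as applied without running it


def baseline_plan(
    migration_name: str = "0_init",
    schema_path: str = "prisma/schema.prisma",
) -> BaselinePlan:
    """Build (but do not execute) the commands for a single base migration."""
    diff_cmd = [
        "npx", "prisma", "migrate", "diff",
        "--from-empty",
        "--to-schema-datamodel", schema_path,  # flag name varies by version
        "--script",
    ]
    sql_path = f"prisma/migrations/{migration_name}/migration.sql"
    resolve_cmd = [
        "npx", "prisma", "migrate", "resolve",
        "--applied", migration_name,
    ]
    return BaselinePlan(diff_cmd, sql_path, resolve_cmd)
```

Try this against a copy of the database before touching production, since `resolve --applied` changes what Prisma believes has already been run.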

@ptsi @kylejeong Cheap?? Gemini computer-use is expensive

The new Computer-Use model from @GoogleDeepMind has been crushing it in trusted benchmarks.
We benchmarked the model against the new models from @AnthropicAI as well as @OpenAI on Online Mind2Web & Web Voyager.
Gemini crushes them all in accuracy, speed, and cost (by a lot).
