djhardcore

146 posts

@djhardcore007

Voice AI @TechAtBloomberg @NYU_Courant | Advancing real-time TTS/ASR, emotional agents, and open-source speech tech

Joined January 2020

1.9K Following · 173 Followers

Pinned Tweet

djhardcore @djhardcore007 · Jan 5

Most voice assistants remain half-duplex: rigid, turn-based, inefficient. Full-duplex ASR enables simultaneous listening and speaking—structurally superior for natural interaction. Core concept explained. #VoiceAI #ASR
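A minimal sketch of the full-duplex idea, assuming playback and microphone input can be modeled as two concurrent asyncio tasks. The names `speak`, `listen`, and `full_duplex_turn`, the `"user_speech"` frame label, and the sleep-based timing are all hypothetical stand-ins, not a real audio API. The point is only that listening continues while speaking, so the user can barge in.

```python
import asyncio

async def speak(chunks, spoken, interrupted):
    """Play TTS chunks one by one, stopping once the user barges in."""
    for chunk in chunks:
        if interrupted.is_set():
            return
        spoken.append(chunk)
        await asyncio.sleep(0.01)  # stand-in for audio playback time

async def listen(frames, heard, interrupted):
    """Consume microphone frames while the agent may still be speaking."""
    for frame in frames:
        await asyncio.sleep(0.01)  # stand-in for frame arrival
        heard.append(frame)
        if frame == "user_speech":  # hypothetical label from a VAD/ASR front end
            interrupted.set()

async def full_duplex_turn(chunks, frames):
    """Run speaking and listening at the same time (full-duplex)."""
    spoken, heard = [], []
    interrupted = asyncio.Event()
    await asyncio.gather(
        speak(chunks, spoken, interrupted),
        listen(frames, heard, interrupted),
    )
    return spoken, heard

spoken, heard = asyncio.run(
    full_duplex_turn(list(range(50)), ["silence", "user_speech", "silence"])
)
print(len(spoken), heard)
```

In a half-duplex design the `listen` task would only start after `speak` finished, so the interruption would go unnoticed until the agent stopped talking.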

djhardcore @djhardcore007 · Mar 10

Seeing k-dense.ai mass-produce ML research scared the shit out of me. I do think that even though AI can write and conduct research far better than most human researchers, research still matters—because reality still resists us. Books still matter because people still want compressed wisdom, perspective, and meaning. So the question becomes less “Why write?” and more “What can I produce that is not just more text?”

djhardcore @djhardcore007 · Mar 3

Real-time Voice UX Stack
• VAD (voice activity detection)
• Streaming ASR
• Turn classifier
• Interrupt handler
• TTS with low startup latency
• Audio buffering control
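As an illustration of the first layer in the stack above, here is a rough energy-based VAD sketch. It assumes audio arrives as frames of float samples in [-1, 1]. The `threshold` and `hangover` values are arbitrary illustrative defaults, and production systems typically use a trained VAD model rather than a fixed energy gate.

```python
import math

def rms(frame):
    """Root-mean-square energy of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def detect_speech(frames, threshold=0.1, hangover=2):
    """Return one speech/non-speech flag per frame.

    `hangover` keeps the flag on for that many quiet frames after speech,
    so short pauses inside an utterance do not end the turn early.
    """
    flags = []
    quiet = hangover  # start in the "silence" state
    for frame in frames:
        if rms(frame) >= threshold:
            quiet = 0
            flags.append(True)
        else:
            quiet += 1
            flags.append(quiet <= hangover)
    return flags

loud, soft = [0.5] * 4, [0.01] * 4
print(detect_speech([soft, loud, soft, soft, soft]))
# → [False, True, True, True, False]
```

The flags from a stage like this would feed the turn classifier and interrupt handler further down the stack.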

djhardcore @djhardcore007 · Mar 3

Tools to build real-time voice agents:
• PIPECAT – github.com/pipecat-ai/pip…
• LiveKit – livekit.io
• VAPI – vapi.ai
• Meta Voice SDK – developers.meta.com/horizon/docume…
#voiceagent

djhardcore @djhardcore007 · Feb 2

How to reduce LLM inference cost:
Prompting
- Use code (tools / skills) whenever possible
- Add a router: default to small, cheap models (e.g. Qwen)
- Cache prompts, results, and RAG outputs
- Keep context short; use summaries
- Limit retries and agent loops
If you self-host LLMs
- Batch inference and constrain decoding
- Quantize models when possible
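Two of the prompting-side items above, the router and the cache, can be sketched together. Everything here is an assumption for illustration: the model names, the `hard_markers` heuristic, the word-count cutoff, and `call_model`, which is a stand-in for a real inference call rather than any provider's API.

```python
from functools import lru_cache

SMALL_MODEL = "small-model"  # hypothetical cheap default
LARGE_MODEL = "large-model"  # hypothetical expensive fallback

def route(prompt, max_small_words=50):
    """Default to the small model; escalate only for long or 'hard' prompts."""
    hard_markers = ("prove", "step by step", "refactor")  # illustrative heuristic
    is_hard = any(marker in prompt.lower() for marker in hard_markers)
    is_long = len(prompt.split()) > max_small_words
    return LARGE_MODEL if (is_hard or is_long) else SMALL_MODEL

calls = []  # records every uncached inference call

def call_model(model, prompt):
    """Stand-in for a real inference call."""
    calls.append((model, prompt))
    return f"[{model}] answer to: {prompt}"

@lru_cache(maxsize=1024)
def answer(prompt):
    """Cache results by exact prompt so repeats cost nothing."""
    return call_model(route(prompt), prompt)

answer("What is WER?")
answer("What is WER?")  # served from the cache
print(len(calls))       # → 1
```

An exact-match cache only helps when prompts repeat verbatim; normalizing prompts before lookup is one way to raise the hit rate.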

djhardcore @djhardcore007 · Jan 26

@openclaw OS-level agency is the future!
1. Persistent local memory, not sessions.
2. Apps stop being destinations and become workflows executing intent. Agents abstract tools and optimize execution.
3. Financial access must be guard-railed via limited, programmable wallets.

djhardcore @djhardcore007 · Jan 16

Vertical AI / physical AI companies are, in practice, data companies.

Packy McCormick @packyM

Investors are betting billions of dollars that robotics will experience a Giant Leap. Meaning: robots are not useful today, but throw enough GPUs, models, data, and PhDs at the problem, and you’ll cross some threshold on the other side of which you will meet robots that can walk into any room and do whatever they’re told.
The Giant Leap view is sexy. It holds the promise of a totally unbounded market – labor today is a ~$25 trillion market, constrained by the cost and unreliability of humans; if robots become cheap, general, and autonomous, the argument goes that you get Jevons Paradox for labor - available to whichever team of geniuses in a garage produces the big breakthrough first.
This is the type of innovation that Silicon Valley loves. Brilliant minds love opportunities where success is just a brilliant idea away.
My friend @evanbeard is betting that progress will happen by climbing the gradient of variability. That robotics will progress towards general usefulness in small steps. The logic is clear:
- Robotics is bottlenecked on data.
- The best data is the data your robots collect actually doing things.
- The best strategy, then, even if it's not the sexiest, is to get paid to collect that data, learn, and iterate.
This is where the vast majority of value lies, and the real path to our abundant robotic future.
For the first co-written essay in not boring world, Evan and I write about the robots.

djhardcore @djhardcore007 · Jan 16

Vibe coding feels amazing right up until you run into maintenance issues. Bills come due.

djhardcore @djhardcore007 · Jan 16

Let go and let's go!

Sahil Bloom @SahilBloom

The ultimate life hack is the ability to quickly reset and recover. From a bad interaction. From a bad day. From a missed workout. From a poor decision. You can start over whenever you want. You can't always control what happened, but you can control how long you carry it.

djhardcore @djhardcore007 · Jan 15

Questioning is thinking. People don’t question because they are lost. People who treat questioning as an attack do so because their certainty is fragile.

djhardcore @djhardcore007 · Jan 14

For a specific domain like finance, perception matters:
1. Build a hard set (numbers, tickers, acronyms, names).
2. Synthesize all samples.
3. Run ASR → WER (overall + hard set).
4. Compute entity accuracy (non-negotiable).
5. Human MOS + trust check.
6. Verify RTF, latency, and failure rate.
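Steps 3 and 4 above can be sketched as follows, assuming the synthesized audio has already been transcribed back to text by an ASR system. WER here is the standard word-level edit distance divided by reference length. The `entity_accuracy` helper is a simplified assumption: it only checks single-token entities by exact (case-insensitive) match, and the sample sentence is made up.

```python
def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # prev[j] holds the edit distance between the ref prefix so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / max(len(ref), 1)

def entity_accuracy(entities, hypothesis):
    """Fraction of required single-token entities found in the transcript."""
    if not entities:
        return 1.0
    tokens = set(hypothesis.lower().split())
    hits = sum(1 for entity in entities if entity.lower() in tokens)
    return hits / len(entities)

reference = "aapl rose 3 percent today"
transcript = "apple rose 3 percent today"  # made-up ASR output of the TTS audio
print(wer(reference, transcript))                    # → 0.2
print(entity_accuracy(["AAPL", "3"], transcript))    # → 0.5
```

The example shows why the two metrics are reported separately: a WER of 0.2 looks tolerable, yet the one wrong word is the ticker.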

djhardcore @djhardcore007 · Jan 14

Questions to answer:
1. Does it sound human?
2. Can users trust what they hear?
3. Accent/voice/style: does it sound like the right person?
4. Prosody diagnostics: mainly for debugging.
5. Latency: a shipping constraint.

djhardcore @djhardcore007 · Jan 14

Modern TTS is neural sequence modeling: pronunciation, timing, prosody, voice identity. Evaluation = perception + correctness + deployability.
Minimal scorecard that matters:
1. MOS
2. WER (overall + hard set)
3. Entity accuracy
4. RTF + latency
5. Failure rate
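The deployability rows of the scorecard above (RTF, latency, failure rate) reduce to simple arithmetic. The sketch below assumes RTF is defined the usual way, synthesis time divided by the duration of the audio produced, and uses a nearest-rank percentile for latency. The sample numbers are invented for illustration only.

```python
import math

def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF < 1 means audio is generated faster than it plays back."""
    return synthesis_seconds / audio_seconds

def percentile(values, pct):
    """Nearest-rank percentile, e.g. pct=95 for p95 latency."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def failure_rate(successes):
    """Share of requests that failed, given one success flag per request."""
    return successes.count(False) / len(successes)

latencies_ms = [120, 80, 200, 150, 100]  # invented first-audio latencies
print(real_time_factor(0.5, 2.0))               # → 0.25
print(percentile(latencies_ms, 50))             # → 120
print(percentile(latencies_ms, 95))             # → 200
print(failure_rate([True, True, False, True]))  # → 0.25
```

Reporting a tail percentile next to the median matters for voice, since a single slow response is what users notice.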

djhardcore @djhardcore007 · Jan 6

5/ Combining Skills with MCP? MCP grants standardized real-time access (repos, Slack, databases). Skills dictate precise logic. Integrated: agents process live data via controlled, repeatable frameworks. This is the optimal config. #MCP #ClaudeSkills #AgenticAI

djhardcore @djhardcore007 · Jan 6

4/ Skills vs. Tool Calls
- Tool calls retrieve or act on real-time data.
- Skills provide orchestration: deciding when, why, and how those tools are used.
Tools handle execution. Skills handle control flow. Production agents need both.
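A conceptual sketch of the split described above, in plain Python. It is not the Agent Skills format itself and not any vendor API: `get_price` and `send_alert` are hypothetical tools with canned data, and `price_alert_skill` is a hypothetical skill that only decides when and how those tools run.

```python
# Tools: narrow functions that fetch data or perform an action.
def get_price(ticker):
    """Hypothetical tool returning canned prices for made-up tickers."""
    return {"AAA": 190.0, "BBB": 410.0}[ticker]

def send_alert(message, outbox):
    """Hypothetical tool that 'sends' an alert by appending to an outbox."""
    outbox.append(message)

# Skill: the control flow that decides when, why, and how tools are used.
def price_alert_skill(ticker, threshold, outbox):
    price = get_price(ticker)           # always look up the live value first
    if price > threshold:               # act only when the rule is met
        send_alert(f"{ticker} is above {threshold}: {price}", outbox)
        return "alerted"
    return "no action"

outbox = []
print(price_alert_skill("AAA", 150, outbox))  # → alerted
print(price_alert_skill("BBB", 500, outbox))  # → no action
print(outbox)
```

Keeping the decision logic out of the tools means the same tools can be reused by other skills with different rules.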

djhardcore @djhardcore007 · Jan 6

Built @claudeai Agent Skills hands-on. My notes:
– What Agent Skills are
– How to build them
– Skills vs. prompting
– Skills vs. tool calls
– Using Skills with MCP
#ClaudeSkills #AIAgents #BuildAI
