Aerbits

60 posts

@aerbitsai

Transform Your City efficiently and effectively by using data and intelligence and an aerial perspective.

San Francisco, CA · Joined April 2023

21 Following · 21 Followers

Aerbits @aerbitsai · 10 Feb

@tavus This level of emotional information helps the replica to read between the lines. It's not so clueless to your subtle sarcasm anymore.

Aerbits retweeted

Tavus @tavus · 10 Feb

Most conversational AI understands words, not people. Introducing Raven-1, our audio and video perception model that gives AI the ability to understand emotion, intent, and context the way humans do.

Aerbits @aerbitsai · 3 Feb

At this location I estimated that over 1,000 sq ft of surface area is covered in illegally dumped water waste. Does anyone have tips on who did this? Please send them to me.

Aerbits @aerbitsai · 3 Feb

Someone dumped the most gigantic pile of garbage on a street in San Francisco last weekend. Using my aerial detection platform Aerbits.ai I estimated this garbage pile covers 625 square feet of ground. It’s also several feet tall, over six feet tall at the center. And they didn’t stop there. There are four other smaller, but still very large piles around it. It’s catastrophic. Unfathomable. Ludicrous.

Aerbits @aerbitsai · 14 Jan

@solve_sf Awesome work

Solve SF @solve_sf · 13 Jan

Before and after - 15th and Valencia - cleaned by DPW this morning.

Aerbits @aerbitsai · 14 Jan

@kwindla @tavus Thanks @kwindla. Some of these new state space and meta-in-context learning model architectures hold a lot of promise for conversational AI. I’m particularly excited about modeling more conversational control and understanding.

kwindla @kwindla · 14 Jan

.@tavus just published a nice blog post about their "real-time conversation flow and floor transfer" model, Sparrow-1.

This model does turn detection, predicting when it's the Tavus video agent's turn to speak. It does this by analyzing conversation audio in a continuous stream and learning and adapting to user behavior.

This model is an impressive achievement. I've had a few opportunities to talk to @code_brian, who led the R&D on this model at Tavus, about his work. I love Brian's approach to this problem. Among other things, the Sparrow-1 architecture allows this model to do things like handle overlapping speech, and predict when someone is going to stop talking before they actually do.

It's worth reading the Sparrow-1 blog post and watching Brian's explainer video if you're interested in conversational AI tech. Right now you can only use this model as part of the Tavus full stack. (It's not available separately as weights or via an API.)

I recorded some video just before Christmas of the Tavus Santa Claus avatar, which used the Sparrow-1 model. I never got around to posting that video clip. I had an idea to write up something about the "Santa Claus Avatar Benchmark," tracking the year-over-year improvement in interactive AI Santa demos. But I'll leave imagining that tongue-in-cheek post as an exercise to the reader and just put the video here as an example of an AI agent that uses the Sparrow-1 model for turn detection!

Aerbits @aerbitsai · 2 Jan

@kwindla We have to learn/work to make our models run on CPU

kwindla @kwindla · 1 Jan

I know this kind of "here's what might happen next year" prediction is click bait, but how can you cover the tech industry and not understand how massively under-provisioned we are for inference needs *today*? The gap between inference demand and supply is maybe the most under-reported story of the second half of 2025. All the big providers are playing musical chairs with research, different models in their line-ups, and different customer cohorts. Nobody has enough chips. Anything that increases inference demand only helps NVIDIA. Nobody else is positioned to fill this supply gap anytime soon.

Aerbits @aerbitsai · 6 Jun

@tavus Air Gerdan FTW!!!

Tavus @tavus · 6 Jun

Office vibes ✅ Cutting edge tech ✅ Slam dunks ✅ Looking for your next big role? We’re hiring.

Aerbits retweeted

fal @fal · 24 Apr

Hummingbird Lipsync by @tavus is now available on @FAL 🐦

Get a photorealistic, zero-shot AI lip sync model for your video projects. Fast, consistent, and cost-effective.

Perfect for:
🎬 Video Editing
🌍 Localization
🤖 AI Media

Try it now on the fal model gallery! fal.ai/models/fal-ai/…

Aerbits @aerbitsai · 9 Apr

@kwindla @replicate and @cerebriumai are pretty good options for long-running AI. That being said, I think there is a ton of room to optimize deployments, especially when running local models versus services. I’m very keen on and curious about managing a custom Kubernetes cluster to take advantage of shared stateless models.

kwindla @kwindla · 8 Apr

tldr: right now most people are building new voice AI clusters in-house using Kubernetes. You can also check out Pipecat Cloud, which tries to make production deployments of open source voice AI as easy as `docker push` docs.pipecat.daily.co/introduction

These days, if you're building out infra for scalable, long-running processes, you're either:
1. Building on top of Kubernetes
2. Building on top of a higher-level or differently opinionated infra provider like Fly, Modal, or Cloudflare Workers.

One important thing to note is that many infrastructure products don't support long-running processes, UDP networking, or both. So you can't use AWS Lambda or Google Cloud Run for voice AI today.

Both (1) and (2) above are a significant amount of work. If you have a lot of k8s experience on your team, you'll be able to set up voice agent clusters in a few days. But some of the building blocks are going to be different from what you've got in production for your other workloads. (How you do capacity planning, think about cold starts, do rolling deployments so you don't drop current sessions, etc.) So factor in extra work specific to the maintenance and evolution of your voice AI clusters.

None of the new school cloud providers yet have setups optimized for realtime AI. But I think that will change.

We think Pipecat Cloud sits at the sweet spot between giving you the full flexibility of building your own agents while also taking all the devops headaches away. Let us know what you think.

kwindla @kwindla · 8 Apr

I definitely agree that a bot manager is a useful pattern. There are examples floating around that call this a "bot runner." Pipecat itself is un-opinionated about deployment/scaling architectures. The basic model is "one Python process per agent." Beyond that you can build whatever scaffolding makes sense for your use case. ... 🧵

Govind-S-B @violetto96

@kwindla @mem0ai love pipecat, good to see yall pluggin in with other things. only issue is while i like ur higher level features. the subprocess bots kinda thing sound a nightmare for scaling. shouldnt there be a bot manager instance and then bots could be individual workers that can spin up
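
The "bot runner" pattern discussed in this exchange, with one Python process per agent and a manager that tracks them, can be sketched roughly as below. This is an illustrative sketch only and not Pipecat's API: the `BotRunner` class, its method names, and the `AGENT_CODE` placeholder are all hypothetical, and a real agent would run a long-lived pipeline rather than print one line.

```python
import subprocess
import sys
from typing import Dict, List

# Hypothetical stand-in for a real agent entry point: it only prints a line.
AGENT_CODE = "import sys; print(f'agent for session {sys.argv[1]} running')"


class BotRunner:
    """Tracks one child Python process per active session (assumed design)."""

    def __init__(self) -> None:
        self._procs: Dict[str, subprocess.Popen] = {}

    def start(self, session_id: str) -> None:
        """Spawn a dedicated Python process for this session."""
        if session_id in self._procs:
            raise ValueError(f"session {session_id} already has an agent")
        self._procs[session_id] = subprocess.Popen(
            [sys.executable, "-c", AGENT_CODE, session_id],
            stdout=subprocess.PIPE,
            text=True,
        )

    def wait(self, session_id: str) -> str:
        """Block until the agent exits, then return what it printed."""
        proc = self._procs.pop(session_id)
        out, _ = proc.communicate()
        return out.strip()

    def active_sessions(self) -> List[str]:
        """Session ids that still have a tracked process."""
        return sorted(self._procs)
```

Because each agent lives in its own process, a crash in one session does not take down the others; the trade-off raised in the reply is that something still has to schedule those processes across machines.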

Aerbits @aerbitsai · 28 Mar

@ejc3 @venturetwins @tavus As staff engineer at Tavus, and one of the primaries on CVI, I can say none of this is scripted. Try it yourself.

EJ Campbell @ejc3 · 17 Mar

@venturetwins @tavus I’m sure that is scripted, <long pause> Justine.

Justine Moore @venturetwins · 17 Mar

This is one of the crazier interactions I've had with an AI avatar. I was chatting with the new @tavus real-time character and didn't know that he could see...until he complimented my background out of nowhere 🤯 You can hear how startled I was about halfway through!

Aerbits @aerbitsai · 22 Mar

@kwindla I want one!!!

kwindla @kwindla · 21 Mar

This is like that Willy Wonka thing except you can redeem it for 36GB of HBM.

Aerbits @aerbitsai · 22 Mar

@AlexReibman @adamsilverman @elevenlabs @tavus It would be cool if the replica could be a sort of digital waiting room for your Zoom call.

Alex Reibman 🖇️ @AlexReibman · 25 Feb

My co-founder was spending 20+ hours per week handling sales calls

So I built an AI clone of @adamsilverman to replace him
- @elevenlabs + @tavus to create an interactive replica that asks questions related to the deal
- o3 agent reads the transcript and identifies key qualification points
- Leads get uploaded to our CRM and every interaction gets saved in @AgentOpsAI

The tool took me less than a day to build at the @elevenlabs hackathon. Automate the boring stuff with agents.

DM/comment for access

Aerbits @aerbitsai · 11 Mar

@ProjectLincoln That's a very hard line to take on the man. He's also been very successful in each of these ways. All of these programs have had immense impact and success in so many ways.

The Lincoln Project @ProjectLincoln · 10 Mar

Elon is in charge of X, and it’s crashing. He’s in charge of Tesla, and the stock is plummeting. He’s in charge of SpaceX, the rockets are exploding. He’s in charge of DOGE, how do you think that is going to end?

Aerbits @aerbitsai · 11 Mar

@kwindla @UtopicDev @tavus Hey, yeah, absolutely. I would love to work together. Perhaps we can share some notes.

kwindla @kwindla · 11 Mar

@aerbitsai @UtopicDev @tavus Congratulations on your model launches last week! So great. I initially missed the Sparrow-0 news (I have covid at the moment) and just got caught up earlier today. Would love to work together on solving turn detection if that’s ever of interest.

kwindla @kwindla · 6 Mar

Open source, native audio turn detection 🎉🎉🎉 Most voice agents today do turn detection by waiting for speech pauses of a specific, short length. That's not how humans do turn detection when we talk to each other! I've been working with some friends on a new turn detection model. If you're interested in this problem or in learning more about ML engineering, come hack on a small model with us! More details below.
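
The pause-based approach that the post describes as today's common practice can be sketched as follows. This is a minimal illustration under stated assumptions, not the model being announced and not any library's implementation: the `detect_end_of_turn` name, the 20 ms frame size, and the 600 ms threshold are hypothetical, and the input is assumed to come from some upstream voice activity detector that emits one speech/no-speech flag per frame.

```python
from typing import Iterable, Optional

FRAME_MS = 20             # assumed duration of one audio frame
PAUSE_THRESHOLD_MS = 600  # assumed silence length that ends a turn


def detect_end_of_turn(
    speech_flags: Iterable[bool],
    frame_ms: int = FRAME_MS,
    pause_threshold_ms: int = PAUSE_THRESHOLD_MS,
) -> Optional[int]:
    """Return the index of the frame where the turn is declared over,
    or None if the speaker never pauses for long enough.

    speech_flags is a hypothetical per-frame VAD output (True = speech).
    """
    heard_speech = False
    silence_ms = 0
    for i, is_speech in enumerate(speech_flags):
        if is_speech:
            heard_speech = True
            silence_ms = 0          # any speech resets the pause timer
        elif heard_speech:
            silence_ms += frame_ms  # only count silence after speech began
            if silence_ms >= pause_threshold_ms:
                return i
    return None
```

A fixed timer like this treats a mid-sentence hesitation and a finished thought identically once the silence is long enough, which is the gap the post contrasts with how people actually take turns.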

Aerbits @aerbitsai · 11 Mar

I've been working on the same thing at @tavus since November: a semantic/lexical turn detection model. We released the first version, Sparrow-0, about a week ago. There is still plenty of room for improvement, and I'm already working on an audio-in model, so we can also use prosody.

kwindla @kwindla · 6 Mar

I'd love to hear about the approach you're taking. I've been thinking about this problem for a while, and spent all of the Christmas break training a series of proof-of-concept models and generating synthetic data. There are so many possible paths to "solving" this. It's a classic deep rabbit hole challenge that, to do a good job, requires lots of "boring" work. (But I love boring work.)

Aerbits @aerbitsai · 21 Feb

@feeltheomega @AlexReibman Wait, Tavus offers a realtime streaming solution already. It has sub-second response times and a lot of really great interactivity and APIs.

o-mega.ai @o_mega___ · 21 Feb

digital twin tech's evolving faster than moore's law on steroids, but still no perfect match for your hackathon needs. virbo ai twin and tavus video generation are leading the pack with apis and uncanny valley-dodging realism. but live streaming. that's the final boss they haven't conquered yet

heads up. you might need to franken-stack this. consider coupling one of these twins with a dedicated live streaming solution. it's not ideal, but neither was the first iphone prototype

pro tip: keep an eye on open source projects. they're often where the real innovation happens before big tech catches up. might find some gems for api integration and live streaming capabilities

good luck at the eleven labs voice hackathon. curious to see what unholy ai chimera you'll unleash on the world. keep us posted on your digital doppelganger adventures

Alex Reibman 🖇️ @AlexReibman · 21 Feb

Are good “digital twin” AI products (video clones) out there?

Preferably:
- API available
- Not uncanny valley
- Handles live streaming

Gonna put to work at the eleven labs voice hackathon this weekend

Aerbits @aerbitsai · 15 Feb

We’re doing some of the most interesting work of my career at @tavus. In August we launched the fastest, most realistic video-to-video conversational AI. The last few days I’ve been optimizing our latest offering. Coming out soon!

Aerbits retweeted

Cerebras @cerebras · 7 Jan

Tavus is building the first, real-time digital clone, now powered by Cerebras Inference, to deliver an instant and natural conversation flow.

Switching to Cerebras
⏱️ cut Time to First Token (TTFT) by 66%
⬆️ increased their Token Output Speed (TPS) by 3X.

Experience the Cerebras difference...with a little help from @tavus CEO @hassaanraza97's digital twin. 🤠

Aerbits @aerbitsai · 1 Nov

@drvolts I personally would care very much if he was doing those things now. It would be highly inappropriate. It was inappropriate when he did those things: 30 years ago. But he stopped doing them. He stopped groping women… 30 years ago.

David Roberts @drvolts · 31 Oct

You see what I'm getting at? I feel like there's nothing I can say about Trump that isn't obvious, that isn't well-understood public knowledge. If you still support him at this point, you clearly don't *care* about all that stuff. And if you don't care about all that stuff ...

David Roberts @drvolts · 31 Oct

I'm glad I don't have to write an endorsement piece, because I really wouldn't know how to go about it. Ever since 2015, when Trump descended the escalator, I have had the same feeling, which I've never quite seen articulated, so I will briefly try:
