Kaahan Radia

18 posts


@kradisme

Building something new @keyframelabs. Ex-Zipline.

Los Angeles, CA · Joined April 2020

60 Following · 37 Followers

Kaahan Radia retweeted
Y Combinator
Y Combinator@ycombinator·
Datost (@datostapp) is an AI data analyst in Slack. It keeps a semantic layer of your business definitions, CRM, docs, and codebase so it knows what questions mean. 75.2% on the hardest public text-to-SQL benchmark, where Opus 4.6 scores 33%. Congrats on the launch, @maceock & @jasonhywang! ycombinator.com/launches/Pxg-d…
Kaahan Radia retweeted
LiveKit
LiveKit@livekit·
We built a demo with Keyframe Labs avatars on the LiveKit Agents Framework. The avatar doesn't just lip-sync. It picks up on the emotional context of the conversation, and you can see it in its face when the mood changes. It can also hand off the conversation to a different agent without reconnecting. The new agent fires RPCs to update the UI in real time. LiveKit x Keyframe plugin and sample repo in the thread.
Kaahan Radia
Kaahan Radia@kradisme·
@arnie_hacker IMO for initial experiments on a small dataset, pre-extract the frames as images; it's way easier to debug and iterate. For larger datasets it kinda depends on your video characteristics; getting sampling diversity might require decoding more frames than you think. Preproc + streaming webdataset is our go-to.
Arnie Ramesh
Arnie Ramesh@arnie_hacker·
Anyone experienced with training video diffusion models? Noob question: do you pre-process mp4 into individual frames and store them before training? Doesn't this blow up storage requirements? Or do you dynamically convert mp4 into frames during training (how is this parallelized?)
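The reply above says sampling diversity may force you to decode more frames than expected. A minimal sketch of a strided clip sampler of the kind you might use inside a streaming webdataset pipeline (the function name and parameters are illustrative, not from either poster):

```python
import random

def sample_clip_indices(num_frames, clip_len, stride, rng=random):
    """Pick `clip_len` frame indices spaced `stride` apart, starting at a
    random offset, so successive samples cover different parts of the video."""
    span = (clip_len - 1) * stride + 1
    if num_frames < span:
        raise ValueError("video too short for the requested clip")
    start = rng.randrange(num_frames - span + 1)
    return [start + i * stride for i in range(clip_len)]

# e.g. a 300-frame video, 16-frame clip, every 4th frame
idx = sample_clip_indices(300, clip_len=16, stride=4)
```

Note that decoding only these indices still usually means seeking through (and often decoding) every frame up to the last index, which is the "more frames than you think" cost.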
Kaahan Radia
Kaahan Radia@kradisme·
@sedielem @ThKouz Matches my intuition; contrary to a bunch of papers that came out late 2025, tuning noise scaling still seems to be necessary, even for these newer architectures.
Sander Dieleman
Sander Dieleman@sedielem·
Neat idea: jointly diffuse pixels and DINO features with separate noise levels. Then optimise the trajectory through 2D noise level space. Could do this with DINO + traditional VAE latents as well to get a souped-up version of ReDi (representationdiffusion.github.io @ThKouz et al.)!
Sander Dieleman tweet media
Alan Baade@BaadeAlan

What's the right space to diffuse in: Raw Data or Latents? Why not both! In Latent Forcing, we order a joint diffusion trajectory to reveal Latents before Pixels, leading to improved convergence while being lossless at encoding and end-to-end at inference. w/ @drfeifei+... 1/n

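One common form of the noise-scale tuning mentioned in the reply above is a timestep shift, which spends more of the diffusion trajectory at high noise levels (e.g. for higher resolutions). A sketch; the shift value is illustrative:

```python
def shift_timestep(t, shift):
    """Remap a noise timestep t in [0, 1] toward higher noise.

    shift > 1 pushes intermediate timesteps upward while keeping the
    endpoints fixed, so more of the schedule sits at high noise."""
    return shift * t / (1 + (shift - 1) * t)

# shift=3 moves the midpoint t=0.5 up to 0.75
mid = shift_timestep(0.5, 3)
```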
Kaahan Radia
Kaahan Radia@kradisme·
the only thing I’ve learned from this whole ai coding thing is that some people are really, really bad at reviewing PRs
Kaahan Radia retweeted
Keyframe Labs
Keyframe Labs@KeyframeLabs·
Introducing the world's most expressive, conversational AI humans. Runs in real-time at just $0.06 per minute. Watch Cosmo move fluidly through emotions in an unedited conversation with our CTO.
Mati Staniszewski
Mati Staniszewski@matiii·
Passing the Turing test for voice agents. We just shipped a major ElevenAgents update:
- Lower-latency, smoother turn-taking with a new conversational model
- Expressive Mode for contextual emotional delivery
- Available in 70+ languages
Kaahan Radia
Kaahan Radia@kradisme·
@gabriberton Did a little bit of this at Zipline; it's harder than you'd think to stop even a medium-capacity ResNet from zeroing out the gradient reversal layer's impact. It would bifurcate its own feature representations!
Gabriele Berton
Gabriele Berton@gabriberton·
A little more info on Domain Adaptation: the task is that you have a labelled train set from one "source" domain (e.g. daytime images) and an unlabelled set from the test/target domain (e.g. night images). [1/N]
Gabriele Berton tweet media
Gabriele Berton@gabriberton

Writing this gave me flashbacks of when CLIP came out. Part of my lab was working on Domain Adaptation, i.e. adapting models to unseen domains. CLIP killed that field: CLIP has seen everything, so suddenly there was a model with no unseen domain. [1/2]

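The gradient reversal layer discussed above is an identity in the forward pass and negates (and scales) gradients in the backward pass, so the domain head trains normally while the backbone is pushed to confuse it. A framework-free sketch with a toy one-layer domain head (all numbers and names are illustrative):

```python
import numpy as np

LAMBDA = 1.0  # reversal strength

def grl_forward(x):
    return x  # identity on the forward pass

def grl_backward(grad_out, lam=LAMBDA):
    return -lam * grad_out  # flip the gradient flowing back to the features

# toy setup: features x, linear domain head w, loss = 0.5 * (w @ x)**2
x = np.array([1.0, 2.0])
w = np.array([0.5, -0.3])
logit = w @ grl_forward(x)
grad_logit = logit                      # dL/dlogit
grad_x = grl_backward(w * grad_logit)   # feature gradient after reversal
```

The failure mode in the reply is the backbone learning to route domain information around this layer (bifurcating its features) rather than actually becoming domain-invariant.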
Kaahan Radia
Kaahan Radia@kradisme·
@KBlueleaf Smells like FSQ shenanigans, maybe a residual variant? Cool stuff.
琥珀青葉@KohakuLab
琥珀青葉@KohakuLab@KBlueleaf·
30k steps from scratch, no GAN training. F16 VQ-VAE with an effective 2^64 codebook size, 512 emb dim; trainable params for the VQ are only 66K
琥珀青葉@KohakuLab tweet media
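On the FSQ guess above: finite scalar quantization bounds each latent dimension and rounds it to a small fixed grid, so the codebook is implicit rather than learned. With 16 dims of 16 levels each you get 16^16 = 2^64 effective codes, matching the tweet's number (the level configuration here is an assumption, not from the tweet):

```python
import numpy as np

levels = np.full(16, 16)  # 16 dims x 16 levels -> 16**16 = 2**64 implicit codes

def fsq_quantize(z, levels):
    """Bound each dim to (-1, 1) with tanh, then snap it to a uniform grid
    with `levels[d]` points per dimension (straight-through during training)."""
    half = (levels - 1) / 2.0
    bounded = np.tanh(z)
    return np.round(bounded * half) / half

z = np.random.randn(16)
q = fsq_quantize(z, levels)
```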
Kaahan Radia
Kaahan Radia@kradisme·
@unilightwf Feels like there’s an obvious extension for TTS — reminds me, in spirit, of the similarity scoring Tortoise did.
Wen-Chin Huang
Wen-Chin Huang@unilightwf·
While everyone is amazed by SAM Audio, the hidden gem to me is the SAM Audio Judge! The SAM Audio Judge assesses how well a separated audio matches a given text description in terms of (1) overall quality (2) recall (3) precision (4) faithfulness. huggingface.co/facebook/sam-a…
Wen-Chin Huang tweet media
Kaahan Radia retweeted
Zipline
Zipline@zipline·
The future of delivery has arrived