Antonio Alonso-Stepanov


(1/5) FP4 hardware is here, but 4-bit attention still kills model quality, blocking true end-to-end FP4 serving. To fix that, we propose Attn-QAT, the first systematic study of quantization-aware training for attention. The result: FP4 attention matches BF16 attention quality, with 1.1x–1.5x higher throughput than SageAttention3 on an RTX 5090 and a 1.39x speedup over FlashAttention-4 on a B200. Blog: haoailab.com/blogs/attn-qat/ Code: github.com/hao-ai-lab/Fas… Checkpoints: huggingface.co/FastVideo/14B_…
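For readers unfamiliar with quantization-aware training, a minimal sketch of the core op (not the Attn-QAT implementation itself): "fake quantization" rounds values through the FP4 (E2M1) value grid during training, so the model learns to tolerate 4-bit rounding error before it is ever served in FP4. The per-tensor scale and the helper name are illustrative assumptions.

```python
# FP4 (E2M1) representable magnitudes: 3 bits of exponent/mantissa plus sign.
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def fake_quant_fp4(xs):
    """Quantize-dequantize a list of floats through the FP4 grid.

    Uses a per-tensor scale so the largest magnitude maps to 6.0,
    the top of the E2M1 range. Illustrative helper, not a real API.
    """
    amax = max(abs(x) for x in xs) or 1.0
    scale = amax / 6.0
    out = []
    for x in xs:
        mag = abs(x) / scale
        q = min(FP4_GRID, key=lambda g: abs(g - mag))  # round to nearest grid point
        out.append(q * scale if x >= 0 else -q * scale)
    return out

# Example: attention scores snap to the nearest FP4-representable value.
print(fake_quant_fp4([0.12, -0.9, 2.4, -6.0, 0.0]))
```

During QAT, an op like this sits in the forward pass (with a straight-through gradient in the backward pass), so the weights adapt to the coarse FP4 grid while training still runs in higher precision.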

(1/N) We're launching Dreamverse. Most AI video models take minutes to generate a 5 s 1080p clip. We generate a 30 s 1080p clip in 4.5 seconds on a single GPU. Our videos generate faster than you can watch them: stop waiting on prompts and start directing scenes live. 🕹️Demo: dreamverse.fastvideo.org 📑 Blog: haoailab.com/blogs/dreamver… Welcome to the era of vibe-directing 👇

(1/N) Content creators have been stuck with costly, slow video generation APIs for far too long. We couldn't take it anymore.😅😭 FastVideo's new real-time inference stack is the fastest 1080p TI2AV pipeline to date.😍🚀🚀 Our optimized LTX-2.3 pipeline creates a 5-second 1080p video with audio in 4.55 s on a single GPU, 3.9x faster than the next-fastest option. 🕹️Live demo: 1080p.fastvideo.org 📜Blog: haoailab.com/blogs/fastvide…


