Lu Jiang

45 posts


@roadjiang

Research Scientist @GoogleAI #GoogleResearch. Adjunct Faculty @CarnegieMellon.

Mountain View, CA · Joined October 2010
132 Following · 582 Followers
Lu Jiang retweeted
Gordon Wetzstein @GordonWetzstein
How do we generate videos on the scale of minutes, without drifting or forgetting about the historical context? We introduce Mixture of Contexts. Every minute-long video below is the direct output of our model in a single pass, with no post-processing, stitching, or editing. 1/4
OverPowered @OverPowere13959
@CeyuanY With so many video models from ByteDance, are any of them going to be open-sourced?
Ceyuan Yang @CeyuanY
Humans interact with this world in real time. Our latest APT2 makes this happen in a video foundation model. Now you can explore the generative world by controlling 6DoF camera poses with negligible latency. Check out more cool stuff at seaweed-apt.com/2
Peter Lin @peter9863

Introducing Seaweed APT2, a real-time, interactive, streaming video generation model. seaweed-apt.com/2 Adversarial training for autoregressive modeling! Streaming one-minute videos, one diffusion step, 24 fps in real time on 1×H100, with interactive controls!

Lu Jiang @roadjiang
@katedeyneka Thank you for attending. I'm glad you liked it.
Kate Deyneka @katedeyneka
Just attended a great talk by @roadjiang at #CVPR2025 on Cost-Effective Training of Video Generation Foundation Model. I liked it because it summarized really well the core techniques for optimal and high quality video generation. Here’s a quick breakdown of key insights from the presentation👇🏻 📄 Paper: seaweed.video
Kate Deyneka tweet media
Lu Jiang retweeted
Ceyuan Yang @CeyuanY
Glad to share Seaweed-7B, a cost-effective foundation model for video generation. Our tech report highlights the key designs that significantly improve compute efficiency and performance given limited resources, achieving quality comparable to other industry-level models.

To unleash the power of the foundation model, Seaweed-7B further enables a wide range of downstream applications, including image-to-video generation, human video generation, subject-consistent video generation, video-audio joint generation, long video generation and storytelling, real-time generation, super-resolution generation, and camera-controlled generation.

Check out our webpage and report for more details:
Webpage: seaweed.video
Paper: seaweed.video/seaweed.pdf

It's been a wonderful journey over the last year. Sincere thanks to all teammates for their contributions.
Lu Jiang @roadjiang
@rtk254 Ronen, interesting discussion! We recently published work showing that training on synthetically generated CGI videos can indeed help models learn to generate videos that better respect physical constraints: kevinz8866.github.io/simulation/ @ronen
Ronen Tamari @rtk254
Video models != world models
"We find that across a range of current models (Sora, Runway, Pika, Lumiere, Stable Video Diffusion, and VideoPoet), physical understanding is severely limited, and unrelated to visual realism"
Ronen Tamari tweet media
Lu Jiang retweeted
AK @_akhaliq
Synthetic Video Enhances Physical Fidelity in Video Synthesis
"A turtle swimming in a green background." + video matting illustration
Lu Jiang @roadjiang
@dreamingtulpa Thanks for covering our work and for the discussion. As mentioned in the paper's abstract: while the model still lacks a deep understanding of physics, it offers one of the first empirical demonstrations that synthetic video enhances physical fidelity in video synthesis.
Dreaming Tulpa 🥓👑 @dreamingtulpa
better real-world physics are coming to video models thanks to synthetic video data
Lu Jiang retweeted
AK @_akhaliq
Seaweed APT: Diffusion Adversarial Post-Training for One-Step Video Generation

Existing diffusion and autoregressive generative models require repeated neural network evaluations. This is extremely slow for high-resolution video generation: a few-second video can take many minutes to generate. Our work is the first to demonstrate the generation of an entire video using a single neural function evaluation (1NFE), via our proposed adversarial post-training technique. Our model generates 2 seconds of 1280×720 24 fps video in real time. We showcase some of the results below:
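The speedup claimed above comes entirely from the number of network evaluations (NFE): a conventional diffusion sampler calls the network once per denoising step, while a one-step generator calls it exactly once. A minimal toy sketch of that cost difference, with a dummy stand-in network (not the actual Seaweed APT model or its adversarial post-training procedure):

```python
# Toy illustration of sampling cost: multi-step diffusion sampling vs.
# one-step generation (1NFE). The "network" below is a dummy function
# that only counts how often it is evaluated.

call_count = 0

def toy_network(x, t):
    """Stand-in for one neural network evaluation; counts calls."""
    global call_count
    call_count += 1
    return x * (1.0 - 0.01 * t)  # dummy update, not a real denoiser

def multi_step_sample(x0, steps=50):
    """Conventional sampler: one network call per denoising step."""
    x = x0
    for t in range(steps):
        x = toy_network(x, t)
    return x

def one_step_sample(x0):
    """One-step generator: a single network call produces the output."""
    return toy_network(x0, 0)

call_count = 0
multi_step_sample(1.0, steps=50)
multi_step_nfe = call_count  # 50 network evaluations

call_count = 0
one_step_sample(1.0)
one_step_nfe = call_count    # 1 network evaluation

print(multi_step_nfe, one_step_nfe)
```

Since each evaluation of a large video model is expensive, cutting NFE from dozens of steps to one is what makes real-time generation feasible in the tweet's setting.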
Lu Jiang @roadjiang
@anuaakash VideoPoet co-author here. Thanks a ton! Due to policy constraints, we weren't able to perform such comparisons. Your analysis is incredibly helpful and reinforces my belief that VideoPoet excels at creating larger motions; its per-frame quality can be further improved.
Anu Aakash @anuaakash
Google VideoPoet, Runway, Pika & Genmo

Google recently announced VideoPoet, a large language model (LLM) capable of a wide variety of video generation tasks, including:
- text-to-video
- image-to-video
- video stylization
- video inpainting and outpainting
- video-to-audio

I tried some of their text-to-video prompts (from their demo) in Pika, Runway and Genmo. Here are the results (10 examples). 1/10: Two teddy bears holding hands, walking down rainy 5th avenue.
Lu Jiang retweeted
Agrim Gupta @agrimgupta92
We introduce W.A.L.T, a diffusion model for photorealistic video generation. Our model is a transformer trained on image and video generation in a shared latent space. 🧵👇
Lu Jiang @roadjiang
😲 While preparing a meta-review for #aaai24, I stumbled upon a new form of parallelism: not in the paper's concepts, but in the review comments, where two reviewers listed identical comments, word for word, over 200 matching words. #PeerReview #AIResearch
Lu Jiang @roadjiang
📢 Call for Papers! International Journal of Computer Vision (IJCV) invites submissions for its special issue on "Generative Models for Content Creation and Manipulation." 🗓️ Manuscript Submission Deadline: February 28, 2024 🔗 Check it out here: springer.com/journal/11263/…
Lu Jiang @roadjiang
@k_saifullaah It seems relevant, and reducing human supervision is a common problem worth tackling. Thanks for sharing!
khalid @k_saifullaah
@roadjiang Our paper "Seeing in Words" might be of interest: we use an LLM (language bottleneck) as a universal interface for image classification. It's truly exciting to see that this kind of approach also demonstrates effectiveness in image reconstruction. arxiv.org/abs/2307.00028
Lu Jiang @roadjiang
Fascinating research by Google reveals the power of large language models (LLMs) like PaLM or GPT in tackling visual tasks using in-context learning. This novel method enables LLMs to perform image generation tasks without requiring any parameter updates. #palm #GPT4 #LLMs
Lu Jiang tweet media