Ge Ya (Olga) Luo

22 posts

@OOOOLGAluo

Montréal, Québec · Joined February 2014
91 Following · 58 Followers
Ge Ya (Olga) Luo retweeted
Animesh Karnewar PhD@AnimeshKarnewar·
AI video generation is poised to be the next revolution, but its heavy computational demands limit real-world deployment. Excited to share Neodragon, my first project after the PhD — a significant step toward efficient, on-device video generation. Webpage: qualcomm-ai-research.github.io/neodragon
3 replies · 14 reposts · 148 likes · 42.3K views
Ge Ya (Olga) Luo retweeted
Saba@Saba_A96·
We built a new autoregressive + RL image editing model using a strong verifier, and it beats SOTA diffusion baselines using 5× less data. 🔥 EARL: a simple, scalable RL pipeline for high-quality, controllable edits. 🧵 1/
3 replies · 26 reposts · 68 likes · 10.8K views
Ge Ya (Olga) Luo retweeted
el.cine@EHuanglu·
omg.. this can't be real. China's 4DV AI just dropped 4D Gaussian Splatting: you can turn 2D video into 4D with sound.. imagine.. we will be able to change the camera angle and zoom in/out while watching movies. 5 examples:
720 replies · 3.8K reposts · 35.9K likes · 3.6M views
Ge Ya (Olga) Luo retweeted
Anthony Gosselin@antho_gosselin·
🚗💥Introducing Ctrl-Crash: controllable video generation for autonomous driving! SOTA models struggle to generate physically realistic car crashes. We propose an image2video diffusion model with bounding box and crash type control. Website: anthonygosselin.github.io/Ctrl-Crash-Pro… 🧵->
2 replies · 13 reposts · 24 likes · 7.7K views
Anthony Gosselin@antho_gosselin·
The generated clips are 1) Left 2) Right 3) Left. Details such as text and blurriness may give it away. However, Ctrl-Crash produces realistic physical details such as cars shaking and items on the dash moving on impact. We trust that better visuals can be achieved through scaling.
1 reply · 0 reposts · 2 likes · 147 views
Ge Ya (Olga) Luo@OOOOLGAluo·
Yes! As generated samples approach dataset quality, accurate distribution distance measurement necessitates larger sample sizes and refined feature spaces, highlighting sample efficiency and feature space quality as key drivers of metric reliability.
Songwei Ge@Songwei_Ge

Though video generative models have made impressive progress, their automatic evaluation metric is still falling behind! Glad to see analysis and advances in video generation evaluation.

0 replies · 0 reposts · 4 likes · 279 views
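Olga's point about sample size can be illustrated with a toy experiment. This is a minimal sketch under simplifying assumptions (1-D Gaussian features standing in for I3D/VideoMAE embeddings, not the actual FVD/JEDi pipeline): even when "real" and "generated" features come from the identical distribution, the estimated Fréchet distance is biased upward at small sample sizes, so near-dataset-quality samples need many more samples to measure reliably.

```python
import numpy as np

def frechet_1d(x, y):
    """Fréchet distance between 1-D Gaussian fits of two sample sets.
    For Gaussians: d^2 = (mu_x - mu_y)^2 + (sigma_x - sigma_y)^2."""
    return (x.mean() - y.mean()) ** 2 + (x.std() - y.std()) ** 2

rng = np.random.default_rng(0)

# Both sample sets come from the SAME distribution, so the true
# distance is 0; anything measured is pure estimation error, which
# shrinks roughly as 1/n.
for n in (100, 10_000):
    est = np.mean([
        frechet_1d(rng.normal(size=n), rng.normal(size=n))
        for _ in range(200)
    ])
    print(f"n={n:>6}: mean estimated distance {est:.5f}")
```

The small-n estimate is noticeably larger than the large-n one despite the true distance being zero, which is exactly why comparing near-perfect generators demands larger sample sizes and better-behaved feature spaces.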
Ge Ya (Olga) Luo retweeted
Hassan Al-Farhan@HAF_tech·
@jm_alexia I'm sold on JEDi. FVD has been frustratingly limited for video gen models. Great to see innovation like this pushing the field forward.
0 replies · 2 reposts · 2 likes · 408 views
Ge Ya (Olga) Luo retweeted
Benno Krojer@benno_krojer·
AURORA 🌌 is now accepted as a Spotlight at NeurIPS 🥂 We wondered: can a model do *controlled* video generation in a *single* step? So we built a dataset + model for "taking actions" on images via editing, or what you could call single-step controlled video generation.
Benno Krojer@benno_krojer

Did you miss the recent Auroras? No problem! ✨🎆 Super excited to share AURORA, a *general* image editing model + high-quality data that improves where prev work fails the most: Performing *action or movement* edits, i.e. a kind of world model setup Insights/Details ⬇️

5 replies · 17 reposts · 82 likes · 18.9K views
Ge Ya (Olga) Luo@OOOOLGAluo·
Huge thanks to @mufan_li and @benno_krojer for sharing your expertise and feedback! Additional kudos to @Songwei_Ge for his pioneering research and expert guidance on establishing the VideoMAE experiment framework. 💐
0 replies · 0 reposts · 3 likes · 224 views
Ge Ya (Olga) Luo retweeted
Alexia Jolicoeur-Martineau@jm_alexia·
From Meta Movie Gen paper: "automated metrics such as FVD and IS do not correlate with human evaluation scores for video quality, and do not provide useful signal for model development or comparison" Funny, because we actually solved this exact problem! New metric coming soon!😎
Ishan Misra@imisra_

So, this is what we were up to for a while :) Building SOTA foundation models for media -- text-to-video, video editing, personalized videos, video-to-audio One of the most exciting projects I got to tech lead at my time in Meta!

3 replies · 4 reposts · 78 likes · 9.8K views
Ge Ya (Olga) Luo retweeted
Rabiul Awal@_rabiulawal·
🚀 Introducing VisMin (arxiv.org/abs/2407.16772), a benchmark for Visual Minimal Change Understanding! It evaluates VLMs' fine-grained understanding of objects, attributes, relationships, and counting. Code, models & datasets at vismin.net [1/13] 🧵
3 replies · 22 reposts · 55 likes · 17.3K views
Alexia Jolicoeur-Martineau@jm_alexia·
7 years ago we left the parents' basement for a tiny 400 sq ft apartment. Today, we closed on our dream home, in the city, within walking distance of work and groceries, no compromise!
9 replies · 2 reposts · 110 likes · 17.3K views
Ge Ya (Olga) Luo retweeted
Luke Rowe@Luke22R·
How can we generate interesting edge cases to test our autonomous vehicles in simulation? We propose CtRL-Sim, a novel framework for closed-loop behaviour simulation that enables fine-grained control over agent behaviours. 🧵 1/8 arxiv.org/abs/2403.19918
1 reply · 14 reposts · 36 likes · 6.7K views