Xianghao Kong

64 posts


@xk_theo7

Video Gen AI Researcher📍Bay Area | PhDone @UCR_CSE | interpretability, alignment, compositionality of diffusion models | EX - @AdobeFirefly, @SonyAI_global

Riverside, CA · Joined August 2021
391 Following · 139 Followers
Pinned Tweet
Xianghao Kong
Xianghao Kong@xk_theo7·
1/8 🚀 AI Breakthrough: "Interpretable Diffusion via Information Decomposition" 🧠 - Quantitative understanding of conditional diffusion models. - Align text-image data using mutual information. - Goes beyond "attention". 🎉 Accepted at #ICLR2024!
1
2
17
6.2K
Xianghao Kong retweeted
Yunong Liu
Yunong Liu@yunongliu1·
Really excited to see Uni-1 out in the world 🔥Our first unified model. The range of things this model can do is wild: image-to-~100 styles, manga generation, multi-ref with strong identity preservation, temporal storytelling, sketch-to-image, spatial reasoning, multilingual infographics, layering… the capability range is honestly unreal. this is just the start 🫡 check out the blog to learn more lumalabs.ai/uni-1 Proud of the team and what we’re building at @LumaLabsAI 🚀
Luma@LumaLabsAI

Introducing Uni-1, Luma’s first unified understanding and generation model, our next step on the path towards unified general intelligence. lumalabs.ai/uni-1

2
8
62
6.2K
Xianghao Kong
Xianghao Kong@xk_theo7·
@hudsonyeoce Cool cool! We’ve mastered alignment for nouns in image/video models, but verbs (or more abstract terms) are the real challenge in video. Seeing this kind of motion control proves Runway’s cracking the code on abstract concepts🔥
0
0
1
33
Hudson
Hudson@hudsonyeoce·
@xk_theo7 yes!! the prompt adherence in this model is something we really optimised for 😁
1
0
0
37
Xianghao Kong
Xianghao Kong@xk_theo7·
I’m currently in transit to San Diego for NeurIPS. If you’re also killing time, feel free to check out a 2-minute-30-second horror sci-fi short film Michael and I recently created. We’d love any comments or likes: devpost.com/software/dream… Looking forward to catching up at the venue! 🎥
0
0
0
209
Xianghao Kong
Xianghao Kong@xk_theo7·
I feel the debate shouldn’t only be about whether DiT is effective, but also about how information preservation is the key to accelerating diffusion training. Our MicroDiT (arxiv.org/abs/2407.15811) paper showed this: by letting masked token info mix into unmasked ones, we can cut down a lot of tokens with only minor performance loss. Interestingly, two months ago, when I caught up with @StefanABaumann at #CVPR, we discussed how TREAD and MicroDiT are conceptually similar from an information perspective. Maybe it’s time to look at diffusion through an information-theoretic lens: from post-training (for better alignment) to latent space curation, I believe this could lead to some really exciting discoveries!
サメQCU@sameQCU

bros, DiT is wrong. it's mathematically wrong. it's formally wrong. there is something wrong with it

2
2
17
1.8K
Jonathan Fischoff
Jonathan Fischoff@jfischoff·
In some other life, I'm a Russian mob boss
9
0
23
966
Xianghao Kong retweeted
Reka
Reka@RekaAILabs·
Excited to introduce Reka Vision, an agentic visual understanding and search platform. Transform your unstructured multimodal data into insights and actions.
7
23
118
485.8K
Xianghao Kong retweeted
Midjourney
Midjourney@midjourney·
Introducing our V1 Video Model. It's fun, easy, and beautiful. Available at 10$/month, it's the first video model for *everyone* and it's available now.
359
594
3.9K
1.9M
Xianghao Kong
Xianghao Kong@xk_theo7·
Heading to Nashville 🎸 for @CVPR (06/11 - 06/16)! Always excited to catch up with old friends and make new connections. Let’s grab a coffee ☕️ or chat about diffusion models, post-training, or just life! #CVPR2025 #Diffusion #GenerativeAI #Nashville
0
0
3
289
Xianghao Kong retweeted
David
David@DavidSHolz·
you're now closer to the year 2050 than the year 2000
56
95
1.2K
79.1K
Xianghao Kong
Xianghao Kong@xk_theo7·
@tunahansalih @amazon Congrats on staying in SF for a cool summer! Don’t forget to grab a slice at Tony’s Pizza 🍕 in downtown
1
0
1
141
Tuna Meral
Tuna Meral@tunahansalih·
Starting Monday, I’ll be joining the @amazon AGI team as an Applied Scientist Intern. I’ll be working on something exciting that builds on my research in vision generative AI. Grateful for the opportunity and excited for what’s ahead. I’ll be in San Francisco all summer, let me know if you want to grab coffee!
1
0
23
820
Xianghao Kong retweeted
Jinghan Yao
Jinghan Yao@JinghanYao·
📢 Our paper "Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer" has been accepted to #MLSys2025, taking place May 12-15! Excited to share our research at the intersection of machine learning and systems in San Jose, CA. 🎉 Check out the full program here: lnkd.in/ecyzGbwJ #MLSys #MachineLearning #Systems #Conference
0
3
8
1.3K
Xianghao Kong
Xianghao Kong@xk_theo7·
@alec_helbling That's dope🔥! We found mutual info can discover the same thing. And it's beyond attention and model-agnostic!
0
0
3
108
Alec Helbling
Alec Helbling@alec_helbling·
Diffusion Transformers aren't just generative models, but also powerful multi-modal encoders. ConceptAttention creates rich heatmaps of text concepts in images from DiT representations. This even works on real images, and can be applied to tasks like segmentation! Demo 👇
10
55
356
24.4K
Xianghao Kong retweeted