Tengda Han

133 posts


@TengdaHan

Research Scientist @GoogleDeepMind. Previously PhD @Oxford_VGG

Oxford, England · Joined March 2019
598 Following · 1.5K Followers
Tengda Han retweeted
Google DeepMind @GoogleDeepMind
Gemini 3.1 Pro is here. We’ve significantly improved the model’s overall intelligence so it can solve tougher problems. 🧵
Peter Tong @TongPetersb
We have been training with TPUs in academia for two years now (huge thanks to Google TRC!). Works like Cambrian-1, Cambrian-S, RAE, and Scale-RAE would not have been possible without TPUs. We wrote a blog post sharing our experiences, optimizations, and lessons learned: cambrian-mllm.github.io/blog/tpu-train… We hope this can help more people have a smoother experience working with TPUs; they are very powerful!
Tengda Han retweeted
Sayna Ebrahimi @SaynaEbrahimi
I’m looking for PhD students in Audio & Video for a Summer 2026 internship at Google DeepMind! ⚠️ Requirement: Prior publication in this area. To apply, tell me the most critical research gap in AV understanding to see if we are a match! docs.google.com/forms/d/1qTvfE…
Tengda Han retweeted
Weidi Xie @WeidiXie
🚀 Glad to share the exciting project — SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass! We explored the generation of 3D scenes with multiple assets from a single image. 🎉 ACCEPTED by 3DV 2026!!! All resources are open-sourced and publicly available! 📄 Paper: arxiv.org/abs/2508.15769 💻 Code: github.com/Mengmouxu/Scen… 🔗 Model: huggingface.co/haoningwu/Scen… 🌐 WebPage: mengmouxu.github.io/SceneGen #3DVision #AI #GenerativeAI #ComputerVision #3DV2026 #SceneGen
Tengda Han retweeted
joao carreira @joaocarreira
Future AI models will learn predominantly post-deployment – to do the tasks of interest to each user. This will happen throughout an individual's "life". In a new paper arxiv.org/pdf/2512.04085 we lay the groundwork for this type of capability in the wild, from a visual standpoint.
Tengda Han @TengdaHan

Work from @SaynaEbrahimi, myself, @dilaragoekay, @goolygu, Maks Ovsjanikov, Iva Babukova, @DanielZoran_, Viorica Patraucean, @joaocarreira, Andrew Zisserman and @dimadamen at @GoogleDeepMind. Arxiv: arxiv.org/abs/2512.04085

Tengda Han @TengdaHan
Humans learn from unique data -- everyone's OWN life -- but our visual representations eventually align. In our recent work "Unique Lives, Shared World" @GoogleDeepMind, we train models with "single-life" videos from distinct sources, and study their alignment and generalisation.
Tengda Han @TengdaHan
Human perception is active: we move around to see, and we see with intention. In our latest work "Seeing without Pixels", we find "how you see" (how the camera moves) roughly reveals "what you do" or "what you observe" -- and this connection can be easily learned from data.
Tengda Han @TengdaHan
Animated movies can be effortlessly understood by young minds, but appear challenging for video-language models. Why? The key problem is the huge diversity of animated characters -- their appearance ranges from human-like faces to cars, fish, blobs, etc.
Tengda Han @TengdaHan
The SLoMo workshop on "Story-level Movie Understanding & Audio Description" will be on #ICCV2025 Day 1 morning, starting at 8:40 AM in Room 327! @JunyuXieArthur, @maxhbain and Xi will be there in person. See you tomorrow @ICCVConference!! #iccv25
Tengda Han @TengdaHan

Being able to understand, describe and even enjoy movies is one of the pinnacles of computer vision. Interested in movie understanding and audio description? Check out our SLoMo workshop at @ICCVConference #ICCV2025!!

Tengda Han @TengdaHan
Being able to understand, describe and even enjoy movies is one of the pinnacles of computer vision. Interested in movie understanding and audio description? Check out our SLoMo workshop at @ICCVConference #ICCV2025!!
Junyu Xie @JunyuXieArthur

Movies are more than just video clips, they are stories! 🎬 We’re hosting the 1st SLoMO Workshop at #ICCV2025 to discuss Story-Level Movie Understanding & Audio Descriptions! Website: slomo-workshop.github.io Competition: huggingface.co/spaces/SLoMO-W…
