Shuyang (Kevin) Sun

46 posts

@Kevin_SSY

Researcher @GoogleDeepMind. DPhil (PhD) @OxfordTVG; Computer Vision, Multi-modal Modeling. Opinions are my own.

Joined March 2020
154 Following · 293 Followers
Shuyang (Kevin) Sun retweeted
Jon Barron @jon_barron
We have an important result to share: if you reduce multiple dense vision tasks into a single RGB-image-prediction task, fine-tuning a strong image generator (in our case Nano Banana Pro) matches or beats all specialized models for monodepth, normals, and semantic segmentation.
Songyou Peng @songyoupeng

Yay, finally! Introducing Vision Banana🍌 from @GoogleDeepMind, our unified model that outperforms SoTA specialist models on various vision tasks! By treating 2D/3D vision tasks as image generation, we unlock a new foundation for CV. Project page: vision-banana.github.io (1/5)

Replies 11 · Reposts 45 · Likes 487 · Views 65.3K
Shuyang (Kevin) Sun retweeted
Saining Xie @sainingxie
vision🍌 is here vision-banana.github.io if you got into computer vision the way I did, starting with pixel-level labeling tasks like segmentation, edges, depth, or surface normals, you’ll probably feel the same seeing these results -- something big has quietly shifted, and it’s going to change how we approach these problems for good 🧵
Replies 11 · Reposts 112 · Likes 785 · Views 63K
Shuyang (Kevin) Sun @Kevin_SSY
Vision Banana is also advised and supported by Oliver Wang, Saining Xie, Howard Zhou, Kaiming He, Thomas Funkhouser, Jean-Baptiste ALAYRAC, and Radu Soricut. N/N
Replies 2 · Reposts 0 · Likes 10 · Views 2.2K
Shuyang (Kevin) Sun @Kevin_SSY
Are we finally witnessing the GPT-3 moment for computer vision? We just dropped Vision Banana 🍌 , a vision foundation model that seamlessly unifies generation and perception by treating all vision tasks as just another image generation problem. 1/N #googledeepmind #nanobanana
Replies 9 · Reposts 12 · Likes 195 · Views 24.7K
Shuyang (Kevin) Sun @Kevin_SSY
Vision Banana achieves state-of-the-art performance under the zero-shot transfer setting across both 2D and 3D vision tasks, proving that generative vision models are the ultimate generalist foundation. 2/N
Replies 0 · Reposts 1 · Likes 18 · Views 2.2K
Shuyang (Kevin) Sun retweeted
Shangbang Long @ShangbangLong
🚀 Excited to announce Vision Banana 🍌 and our new paper: “Image Generators are Generalist Vision Learners”. We turn Nano Banana Pro into a state-of-the-art visual generation and understanding model. 🖼️ Check out our gallery at vision-banana.github.io 🧵 (1/N) continue ⬇️
Replies 21 · Reposts 71 · Likes 430 · Views 59K
Shuyang (Kevin) Sun @Kevin_SSY
Great work from our student researcher Jiageng Mao @PointsCoder to enable scalable robot learning by imitating AI-generated videos.
Replies 0 · Reposts 2 · Likes 9 · Views 1.6K
Shuyang (Kevin) Sun retweeted
Google DeepMind @GoogleDeepMind
What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵
Replies 813 · Reposts 2.6K · Likes 13.4K · Views 3.7M
Ferenc Huszár @fhuszar
Great day in Oxford joining forces with @oiwi3000 as co-examiners for the DPhil viva of @csaba_botos Congrats Csaba for the great thesis and amazing live demos.
Replies 2 · Reposts 0 · Likes 16 · Views 2.1K
Shuyang (Kevin) Sun @Kevin_SSY
I will be presenting our paper "CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor" at @CVPR poster #341, today (June 20th) at 10:30 am. If you're interested in open-vocabulary segmentation and multi-modal learning, please come and chat with me!
Replies 1 · Reposts 3 · Likes 15 · Views 1.2K
Shuyang (Kevin) Sun retweeted
Ilya Sutskever @ilyasut
After almost a decade, I have made the decision to leave OpenAI.  The company’s trajectory has been nothing short of miraculous, and I’m confident that OpenAI will build AGI that is both safe and beneficial under the leadership of @sama, @gdb, @miramurati and now, under the excellent research leadership of @merettm.  It was an honor and a privilege to have worked together, and I will miss everyone dearly.   So long, and thanks for everything. I am excited for what comes next — a project that is very personally meaningful to me about which I will share details in due time.
Replies 1.4K · Reposts 2.3K · Likes 25.5K · Views 5.9M
Shuyang (Kevin) Sun @Kevin_SSY
Unfortunately, I won't be able to attend ICLR in person due to visa issues. However, if you're keen on using synthetic data for training, I'd love for you to visit our poster, Real-Fake, on Friday morning!
Replies 0 · Reposts 0 · Likes 1 · Views 251
Shuyang (Kevin) Sun @Kevin_SSY
Another paper on using synthetic data for better image recognition. Distribution matching may not be all you need, but it goes a long way toward bridging the real-synthetic domain gap.
Jianhao Yuan @DYDYYDYYYD

🧐What makes good synthetic training data? 🚀Discover our #ICLR2024 work Real-Fake🚀Effective Training Data Synthesis Through Distribution Matching 🔥🔥🔥 🧵1/n 📝Paper: arxiv.org/abs/2310.10402 🔍Project: torrvision.com/realfake/ 📊 Synthetic Dataset: huggingface.co/datasets/Jianh…

Replies 1 · Reposts 0 · Likes 3 · Views 227