Tan Wang (@Wangt97) - Twitter Profile

Tan Wang@Wangt97·Jun 11

@vivid_en Please reply to my email!!!! I have contacted your customer service so many times but got no response

Tan Wang@Wangt97·Nov 26

@wandererkitty @LINJIEFUN Hi, thanks for your interest! Could you please try pip install colorlog first?

Linjie (Lindsey) Li@LINJIEFUN·Jul 6

I am humbled to be re-featured as Women in Computer Vision for the BEST of CVPR section of the Computer Vision News July Magazine. It was great chatting with Ralph Anzarouth. I hope my unconventional career path can encourage more female researchers. rsipvision.com/ComputerVision…

Tan Wang@Wangt97·Oct 8

Had a quick try of LLaVA v1.5 on a sample from our proposed VLM benchmark EqBen (ICCV2023 Oral), and it still seems to struggle😂😂😂… So… There are still many things to do! github.com/Wangt-CN/EqBen

Rowan Cheung@rowancheung

🚨 BREAKING: GPT-4 image recognition already has a new competitor. Open-sourced and completely free to use. Introducing LLaVA: Large Language and Vision Assistant. I compared the viral parking space photo on GPT-4 Vision to LLaVa, and it worked flawlessly (see video).

Zurich, Switzerland 🇨🇭

Tan Wang@Wangt97·Oct 2

- EqBen still cannot be solved by existing open-source MLLMs!
- All the benchmark data and code are open-sourced.
arXiv: arxiv.org/abs/2303.14465
GitHub: github.com/Wangt-CN/EqBen

Tan Wang@Wangt97·Oct 2

Meet us on the 5th (Thu): 4:30-6pm (Room Paris Nord) for the Oral session, or 2:30-4:30pm (Room Foyer Sud) for the Poster!

Tan Wang@Wangt97·Oct 2

Excited to share our EqBen/EqSim, a new benchmark/algorithm focusing on evaluating and improving the similarity measure of V&L foundation models, to be presented in Oral session at #ICCV2023 in Paris! Joint work w/ @linkeyun2 @LINJIEFUN CC Lin, ZY Yang, HW Zhang, ZC Liu, LJ Wang.

Paris, France 🇫🇷

Tan Wang@Wangt97·Jul 9

I will be at @icvss this coming week presenting our DisCo (disco-dance.github.io), with an interactive demo that turns static images into human dancing videos! Big thanks to @GMFarinella, @robertocipolla, @sebattiato for organizing ICVSS.

AK@_akhaliq

DisCo: Disentangled Control for Referring Human Dance Generation in Real World paper page: huggingface.co/papers/2307.00… Generative AI has made significant strides in computer vision, particularly in image/video synthesis conditioned on text descriptions. Despite the advancements, it remains challenging especially in the generation of human-centric content such as dance synthesis. Existing dance synthesis methods struggle with the gap between synthesized content and real-world dance scenarios. In this paper, we define a new problem setting: Referring Human Dance Generation, which focuses on real-world dance scenarios with three important properties: (i) Faithfulness: the synthesis should retain the appearance of both human subject foreground and background from the reference image, and precisely follow the target pose; (ii) Generalizability: the model should generalize to unseen human subjects, backgrounds, and poses; (iii) Compositionality: it should allow for composition of seen/unseen subjects, backgrounds, and poses from different sources. To address these challenges, we introduce a novel approach, DISCO, which includes a novel model architecture with disentangled control to improve the faithfulness and compositionality of dance synthesis, and an effective human attribute pre-training for better generalizability to unseen humans. Extensive qualitative and quantitative results demonstrate that DISCO can generate high-quality human dance images and videos with diverse appearances and flexible motions.

Tan Wang@Wangt97·Jul 4

@LINJIEFUN, @linkeyun2, Chung-Ching Lin, Zhengyuan Yang, Hanwang Zhang, Zicheng Liu, @lijuanwang666

Tan Wang@Wangt97·Jul 4

GitHub page: github.com/Wangt-CN/DisCo
arXiv: arxiv.org/abs/2307.00040
Hugging Face: huggingface.co/papers/2307.00…
YouTube: youtu.be/alJKsj3JpBo

Tan Wang@Wangt97·Jul 4

Thx @_akhaliq! Check out our DisCo at disco-dance.github.io.🔥🔥🔥 🧙‍♂️High generalizability. No need for human-specific fine-tuning! 💃Extensive human-related applications with disentangled control! 👨‍💻Easy-to-follow framework and fully open-source code!

Tan Wang retweeted

Jia-Bin Huang@jbhuang0604·Apr 22

How to network at a conference? Many students will be attending their first-ever *in-person* conference this year. How exciting! 🤩 Some tips on making the best of attending a conference. 🧵

Tan Wang retweeted

Aran Komatsuzaki@arankomatsuzaki·Mar 23

Incorporating Convolution Designs into Visual Transformers

CeiT matches DeiT with 3x fewer iterations by adding three modifications to the architecture.

arxiv.org/abs/2103.11816

Tan Wang retweeted

AK@_akhaliq·Mar 24

Leveraging background augmentations to encourage semantic focus in self-supervised contrastive learning

pdf: arxiv.org/pdf/2103.12719…
abs: arxiv.org/abs/2103.12719