Size Wu

5 posts

@WuSize

PhD student@NTU

Singapore · Joined February 2022
47 Following · 49 Followers
Pinned Tweet
Size Wu @WuSize
🔥 We release Harmon: a unified framework for multimodal understanding & generation with a shared visual encoder (vs. decoupled Janus/-Pro).
💥 SOTA on GenEval, MJHQ, WISE
🧠 Strong understanding performance
📄 Paper: huggingface.co/papers/2503.21…
🔗 Code: github.com/wusize/Harmon
2
1
17
2.5K
Aran Komatsuzaki @arankomatsuzaki
Unveiling Encoder-Free Vision-Language Models
Achieves smaller performance-compute gap between encoder-based VLM and decoder-only VLM
arxiv.org/abs/2406.11832
6
41
174
25.7K
Size Wu @WuSize
🌟 Our paper proposes a clever and novel approach to Open-Vocabulary Object Detection. Come and check it out this afternoon at #CVPR2023!
📍 Location: West Exhibit Hall #276
⏰ Time: 04:30 PM - 06:00 PM
Paper: arxiv.org/abs/2302.13996
0
7
26
1.9K