Wenyan Li

40 posts

@Wenyan62

PhD student at the CoAStaL NLP Group, University of Copenhagen. Former researcher at Comcast AI and SenseTime.

Joined September 2020
201 Following · 285 Followers
Pinned Tweet
Wenyan Li
Wenyan Li@Wenyan62·
Happy to share (with a bit of delay tho) our paper on quantifying visual information loss in VLMs --- "Lost in Embeddings: Information Loss in Vision-Language Models" is accepted to EMNLP 2025 findings: arxiv.org/pdf/2509.11986 💃code is also released: github.com/lyan62/vlm-inf…
Abraham Owodunni
Abraham Owodunni@AbrahamOwos·
@Wenyan62 I really like this paper 👏👏. Read it and shared with some friends.
Wenyan Li
Wenyan Li@Wenyan62·
I will be presenting our Lost in Embeddings poster at EMNLP! Hope to see many old and new friends in Suzhou!🤗🤗 📍Time/Date: Fri. Nov 7 at 12:30-13:30 Location: Hall C Also happy to chat about anything related to VLMs, RAG, and, more recently, fintech. #EMNLP2025
Wenyan Li@Wenyan62

Happy to share (with a bit of delay tho) our paper on quantifying visual information loss in VLMs --- "Lost in Embeddings: Information Loss in Vision-Language Models" is accepted to EMNLP 2025 findings: arxiv.org/pdf/2509.11986 💃code is also released: github.com/lyan62/vlm-inf…

Himanshu Kumar
Himanshu Kumar@codewithimanshu·
@Wenyan62 Sounds exciting, Wenyan! Suzhou sounds lovely, and your work on VLMs is always top-notch, I must say.
Wenyan Li retweeted
Raphael Tang
Raphael Tang@ralph_tang·
📢Our new paper critically examines arena-style LLM evaluation, e.g., LMArena, questioning whether draws actually mean equal model ability. TL;DR: simply ignoring draws improves rating systems by 1-3%, and query difficulty/subjectivity relate more strongly to draws than model ratings do.
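The draw-handling idea above can be sketched with a minimal Elo-style update (a generic illustration, not the paper's actual rating system; the function name and parameters are mine):

```python
def elo_update(r_a, r_b, outcome, k=32, ignore_draws=True):
    """One Elo update for a head-to-head model battle.
    outcome: 1.0 if model A wins, 0.0 if model B wins, 0.5 for a draw.
    With ignore_draws=True, drawn battles leave both ratings untouched."""
    if outcome == 0.5 and ignore_draws:
        return r_a, r_b
    # probability that A beats B under the current ratings
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    delta = k * (outcome - expected_a)
    return r_a + delta, r_b - delta

print(elo_update(1500, 1500, 1.0))  # A wins: (1516.0, 1484.0)
print(elo_update(1500, 1500, 0.5))  # draw is skipped: (1500, 1500)
```

Dropping the `outcome == 0.5` battles entirely is the "simply ignoring draws" variant the tweet describes.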
Srishti
Srishti@_srishtiyadav·
@Wenyan62 Congrats, Dr. Wenyan! 💚
Wenyan Li
Wenyan Li@Wenyan62·
Happy to share that I’ve successfully defended my PhD today 🎉 A big thank you to my committee members Manex Aguirrezabal Zabaleta, Anna Korhonen, and Charlie Clark ❤️ Very grateful for all the support and encouragement from my supervisor Anders Søgaard and colleagues at CoAStaL ❤️
Wenyan Li
Wenyan Li@Wenyan62·
@gietema here we could probably phrase it better. A large drop in overlap ratio indicates that the connector is changing neighborhood structure substantially, which may suggest geometric distortion beyond what is required for task alignment.
Jochem Gietema
Jochem Gietema@gietema·
@Wenyan62 Hi, thanks for this, congrats! Not sure I understand why an optimal connector would maintain the same k-NN sets? Isn't the purpose of the connector to align the embedding for a downstream task, in which case the structure of the original embedding cannot be assumed to be optimal?
Wenyan Li
Wenyan Li@Wenyan62·
@gietema Hi Jochem, thanks for the question. We agree that the purpose of the connector is to align the embedding spaces. The idea is not that preserving kNN neighborhoods is the end goal, but that the overlap ratio provides a way to quantify how much the local structure is perturbed.
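The overlap ratio discussed in this thread can be sketched as follows, assuming `pre` and `post` are matrices of the same image's patch embeddings before and after the connector (a toy illustration; the names and the cosine-based k-NN choice are mine, not necessarily the paper's exact setup):

```python
import numpy as np

def knn_overlap_ratio(pre, post, k=10):
    """Average fraction of each point's k nearest neighbors (by cosine
    similarity) that survive the mapping from `pre` to `post`."""
    def knn_sets(x):
        xn = x / np.linalg.norm(x, axis=1, keepdims=True)
        sim = xn @ xn.T
        np.fill_diagonal(sim, -np.inf)  # a point is not its own neighbor
        # indices of the k most similar points for each row
        return [set(row.argsort()[-k:]) for row in sim]
    before, after = knn_sets(pre), knn_sets(post)
    return float(np.mean([len(b & a) / k for b, a in zip(before, after)]))

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 64))
print(knn_overlap_ratio(x, x))  # identity mapping preserves structure: 1.0
```

A ratio near 1 means the connector keeps each patch's local neighborhood intact; a large drop signals substantial perturbation of the neighborhood structure.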
Lei Li
Lei Li@_TobiasLee·
@Wenyan62 Thanks for sharing! Very insightful for understanding the ViT embeddings
Wenyan Li retweeted
Afra Amini
Afra Amini@afra_amini·
Current KL estimation practices in RLHF can generate high variance and even negative values! We propose a provably better estimator that only takes a few lines of code to implement.🧵👇 w/ @xtimv and Ryan Cotterell paper: arxiv.org/pdf/2504.10637 code: github.com/rycolab/kl-rb
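Background on the negative-values point: the naive Monte Carlo estimator averages log p(x) − log q(x) over samples x ~ p, and individual terms (and hence the mean) can be negative. A widely used non-negative alternative is the so-called k3 estimator from John Schulman's blog note; this sketch contrasts the two and is not the paper's proposed estimator:

```python
import math

def kl_naive(logp, logq):
    """Sample mean of log p(x) - log q(x), x ~ p: unbiased,
    but individual terms (and the mean) can be negative."""
    return sum(lp - lq for lp, lq in zip(logp, logq)) / len(logp)

def kl_k3(logp, logq):
    """k3 estimator: mean of (r - 1) - log r with r = q(x)/p(x).
    Each term is >= 0, so the estimate is never negative."""
    terms = []
    for lp, lq in zip(logp, logq):
        r = math.exp(lq - lp)
        terms.append((r - 1.0) - math.log(r))
    return sum(terms) / len(terms)

# a sample where q assigns higher log-probability than p
print(kl_naive([0.0], [1.0]))       # -1.0: a negative "divergence"
print(kl_k3([0.0], [1.0]) >= 0.0)   # True (the estimate is e - 2)
```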
Wenyan Li retweeted
Chengzu Li
Chengzu Li@li_chengzu·
Forget just thinking in words. 🚀 New Era of Multimodal Reasoning🚨 🔍 Imagine While Reasoning in Space with MVoT Multimodal Visualization-of-Thought (MVoT) revolutionizes reasoning by generating visual "thoughts" that transform how AI thinks, reasons, and explains itself.
Wenyan Li
Wenyan Li@Wenyan62·
🎉🎉
Wenyan Li
Wenyan Li@Wenyan62·
🍗🍗I will present FoodieQA in person at #EMNLP2024😋😋 Looking forward to meeting old and new friends! Feel free to drop by! (and have some snacks) ⏰ Nov, 13th (Wed) 16:00, In-Person Poster Session E (Riverfront Hall) I'm also on the job market and would be happy to chat :)
Wenyan Li retweeted
Xinyu Crystina Zhang
Xinyu Crystina Zhang@crystina_z·
1/7 🚨non-LLM paper alert!🚨 Humans' perception of a sentence is quite robust to interchanging words with similar meanings, not to mention semantically equivalent words across different languages. How about language models? In our recent work, we measure the role of subword-level shared semantics in multilingual LMs. tl;dr: semantically similar subwords could largely *share the same word embedding!* arxiv link: arxiv.org/abs/2411.04530
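The sharing idea in the thread above can be illustrated with a toy merge that ties the embedding rows of semantically similar subwords (the vocabulary, pairing, and mean-merging rule here are illustrative, not the paper's method):

```python
import numpy as np

def tie_subword_embeddings(emb, vocab, pairs):
    """Replace the rows of each semantically similar subword pair with
    their mean, so both subwords share one embedding vector."""
    emb = emb.copy()
    for a, b in pairs:
        ia, ib = vocab[a], vocab[b]
        shared = (emb[ia] + emb[ib]) / 2.0
        emb[ia] = shared
        emb[ib] = shared
    return emb

vocab = {"dog": 0, "hund": 1, "cat": 2}  # toy cross-lingual vocabulary
emb = np.random.default_rng(0).normal(size=(3, 4))
tied = tie_subword_embeddings(emb, vocab, [("dog", "hund")])
print(np.array_equal(tied[0], tied[1]))  # True: "dog" and "hund" now share
print(np.array_equal(tied[2], emb[2]))   # True: "cat" is untouched
```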