Bryan Li

35 posts

@bryanlics

CS PhD student @penn, quantifying & improving the multilingual knowledge of LLMs 🌐📚 BA & MS @columbia

Joined May 2020
216 Following · 159 Followers
Jason Weston@jaseweston·
🌿Introducing MetaCLIP 2 🌿 📝: arxiv.org/abs/2507.22062 code, model: github.com/facebookresear… After four years of advancements in English-centric CLIP development, MetaCLIP 2 is now taking the next step: scaling CLIP to worldwide data. The effort addresses long-standing challenges: (1) large-scale non-English data curation pipelines are largely undeveloped, and (2) the curse of multilinguality, where English performance often degrades in multilingual CLIP compared to English-only CLIP. With a complete recipe for worldwide CLIP—spanning data curation, modeling, and training—we show that English and non-English worlds can mutually benefit and elevate each other, achieving SoTA multilingual performance. Join the Meta booth at #ACL2025 to learn more. (1/3)
Bryan Li@bryanlics·
I'm in Vienna this week to present our poster on the robustness of RAG systems to multilingual contexts at #ACL2025NLP! 🗓️ Poster Session | Wednesday, July 30, 16:00 - 17:30 📍 Hall 4/5 @aclmeeting
Mingyang Wang@mingyang2666·
I'll be at @aclmeeting next week to present this paper! 🗓️ Poster Session | Wednesday, July 30, 11:00–12:30 📍 Hall 4/5 Happy to grab a coffee and chat! ☕
Mingyang Wang@mingyang2666

🎉Excited to share that our paper on cross-lingual inconsistency has been accepted to #ACL2025 🇦🇹! We dissect why LLMs produce inconsistent outputs across languages using interpretability analysis, and propose a simple shortcut-based fix, evaluated on 17 languages. arxiv.org/abs/2504.04264

Bryan Li@bryanlics·
This is the final paper of my PhD! Thanks to my many @upennnlp collaborators: @samarhdr, Chris, and the 7 wonderful students who I was fortunate to mentor. Please look out for our poster at ACL 2025 in Vienna. 4/4 🧵
Bryan Li@bryanlics·
We study cross-lingual robustness over 4 LLMs and 2 IR models. We find A) multilingual RAG performs best; B) LLMs' citation behavior varies widely across languages. Our further experiments investigate aspects of cross-lingual RAG from IR to LLM explanations. 3/4 🧵
Bryan Li@bryanlics·
@yong_zhengxin Really thorough work on multilingual reasoning! A quick self-promotion of our xSTREET dataset arxiv.org/abs/2403.02567… (ACL 2024), which has annotations for the intermediate reasoning steps for STEM problems.
Yong Zheng-Xin@yong_zhengxin·
📣 New paper! We observe that reasoning language models finetuned only on English data are capable of zero-shot cross-lingual reasoning through a "quote-and-think" pattern. However, this does not mean they reason the same way across all languages or in new domains. [1/N]
Bryan Li@bryanlics·
@mykocyigit Congrats! Data contamination is very relevant these days with bigger and bigger training corpora
Bryan Li retweeted
Bowen Jiang (Lauren)@laurenbjiang·
🚀 How well can LLMs know you and personalize your response? Turns out, not so much! Introducing the PersonaMem Benchmark -- 👩🏻‍💻Evaluate LLM's ability to understand evolving persona from 180+ multi-session user-chatbot conversation history 🎯Latest models (GPT-4.1, GPT-4.5, o4-mini, Llama-4, Gemini 2.0, Deepseek-R1, Claude-3.7) all struggle in personalization! 🎨7 personalization skills tested in 15 scenarios 🌟Realistic long-context evaluation up to 1M tokens 👇 Check out what we discovered… (1/6)
Bryan Li@bryanlics·
TL;DR - translation pairs > bilingual terminologies, generation especially boosts translations for small LLMs Our ablations highlight the need for more challenging domain-adapted MT datasets with modern LLMs. Thanks to collaborators Jiaming, @ebriakou & @ColinCherry!
Bryan Li@bryanlics·
Externally retrieving knowledge empowers LLMs for domain-adapted MT ⚖️🩺. But how is knowledge best represented, and how viable is generating it from an LLM itself? Our @GoogleAI paper investigates these questions through a careful experimental setup 📜. arxiv.org/abs/2503.05010
Bryan Li@bryanlics·
@_reachsumit Great work! Nice to see a pipeline approach to multilingual QA generation in 2025. Reminds me of our EMNLP 2023 work arxiv.org/abs/2304.12206 (my last paper without LLMs 😅)
Sumit@_reachsumit·
Few-Shot Multilingual Open-Domain QA from 5 Examples Leverages large-scale self-supervised pre-training using WikiData followed by fine-tuning on LLM-generated synthetic data from just 5 examples per language, outperforming existing few-shot baselines. 📝arxiv.org/abs/2502.19722
Bryan Li retweeted
Shreya Havaldar@shreyahavaldar·
🚨 LLMs must grasp implied language to reason about emotions, social cues, etc. Our @GoogleDeepMind paper presents the Implied NLI dataset. Targeting social norms 🌎 and conversational dynamics 💬, we enhance LLM understanding of real-world implication! arxiv.org/abs/2501.07719
Bryan Li@bryanlics·
We'll be presenting this at the NLP for Wikipedia workshop @emnlpmeeting. This is ongoing work, and we'd love to hear feedback from the community! A shout-out to my collaborators Fiona and Adwait for their amazing first paper efforts, @samarhdr, and Chris. 4/4 🧵
Bryan Li@bryanlics·
Using cross-lingually aligned queries, we analyze responses in a RAG setting. Responses can be "flipped" by varying passages' linguistic composition. We thus find these systems to be far from cross-lingually robust, as certain viewpoints can be amplified over others. 3/4 🧵
Bryan Li@bryanlics·
RAG enables LLMs to access external info 📖. But when this info is multiple languages 🌐, can LLMs reconcile differing viewpoints 🧐? We introduce BordIRlines, a dataset to study the robustness of cross-lingual RAG. 📃arxiv.org/abs/2410.01171 🗃️ huggingface.co/datasets/borde… 1/4 🧵