Huda Khayrallah
842 posts
@HudaKhay
Machine Translation/#NLProc/ML Researcher at Microsoft. Past: @UCBerkeley CS ugrad; @LiltHQ research intern; @jhuCLSP/@jhuCompSci PhD

Lots of work on cross-lingual alignment encourages multilingual LLMs to generalize knowledge across languages. But this push for uniformity creates a tension: what happens to knowledge that should remain local? We look into this trade-off between transfer and cultural erasure:🧵

Multilingual models are usually heavily skewed in favor of high-resource languages. We change this with X-ALMA: an LLM-based translator committed to ensuring top-tier performance across 50 diverse languages, regardless of their resource levels! Paper: arxiv.org/pdf/2410.03115

🧐Which languages benefit the most from vocabulary adaptation? We introduce VocADT, a new vocabulary adaptation method using a vocabulary adapter, and explore the impact of various adaptation strategies on languages with diverse scripts and fragmentation to answer this question.

I’m super thrilled to have won the AMTA Best Thesis Award!! A huge thanks to the AMTA organizers for this recognition ☺️ See you all in Chicago amtaweb.org



1⃣Meta’s Cicero by Bakhtin et al. was the talk of the town when it was released in November 2022. Even major journals reported that Cicero had achieved human-level Diplomacy play in strategy and negotiation. Well, had it!? Our paper has the answer: arxiv.org/pdf/2406.04643

🗣️XLAVS-R is accepted at #ACL2024 main! 🚀🚀 We present XLAVS-R, a cross-lingual audio-visual model for noise-robust speech perception in over 100 languages. Very happy to present our work done during my @AIatMeta internship with @ChanghanWang. arxiv.org/abs/2403.14402

Had a wonderful time presenting my paper from my internship last year with @CuraiHQ at #NAACL2024! Grateful for the opportunity to talk to the awesome and thoughtful people in the NLP community. @elliotschu @anithakan @nairvarun18

It's not the first time! A dream team of @enfleisig (human eval expert), Adam Lopez (remembers the Stat MT era), @kchonyc (helped end it), and me (pun in title) are here to teach you the history of scale crises and what lessons we can take from them. 🧵arxiv.org/abs/2311.05020

Today in @ICB_journal - @armanafzadeh, Janneke Schwaner, and I co-led a brief article on Strategies for Organizing Interdisciplinary Events. @SICB_ @SICB_DCB_DVM If you are interested in hosting an interdisciplinary event, we hope it is helpful: doi.org/10.1093/icb/ic…

🏆 Thrilled to share the launch of the AMTA Best Thesis Award, which aims to highlight the achievements of a recent PhD graduate at an institution in the Americas whose thesis has focused on topics related to machine translation. [1/2]



