Barry Haddow

581 posts

@bazril

Researcher in Informatics at University of Edinburgh. Mainly working on machine translation.

Edinburgh, Scotland · Joined April 2010

656 Following · 1.2K Followers

Barry Haddow retweeted

Weixuan Wang@WeixuanWang66·11 Mar

📣 Excited to share our latest research: "Demystifying Multilingual Chain-of-Thought in Process Reward Modeling" where we explore process reward models beyond English to improve multi-step reasoning in 11 languages! Link: arxiv.org/abs/2502.12663 Code: github.com/weixuan-wang12…

901

Barry Haddow retweeted

HPLT@hplt_eu·17 Mar

New paper on the HPLT v2 dataset making-of: - pipeline documentation and code - extensive analysis of the quality and characteristics - evaluation of the performance of language models and machine translation systems trained on it 🤓Happy reading! arxiv.org/pdf/2503.10267

556

Barry Haddow retweeted

HPLT@hplt_eu·28 Feb

We are happy to announce the second release of HPLT bilingual datasets: - 50 English-centric language pairs = 380M parallel sentences (HPLT) 🤩 - 1,275 non-English-centric language pairs = 16.7B parallel sentences (MultiHPLT) 😮 Available at the HPLT dataset catalogue and OPUS.

1.3K

Barry Haddow@bazril·1 Feb

MT Summit 2025 - deadline extended! The deadline for all papers (technical/user/translator/products/projects) has been extended to February 10th. MT Summit will be in Geneva, June 23-27. mtsummit2025.unige.ch/index.html

337

Barry Haddow@bazril·26 Jan

EAMT best thesis award - closes on January 31st. Completed an MT-related PhD in 2024 in Europe, Africa or the Middle East? Then why not submit your thesis? eamt.org/2024/11/28/the…

546

Barry Haddow retweeted

HPLT@hplt_eu·8 Jan

🥳 Amazing performance of the #HPLT v2 dataset! HuggingFace multilingual evaluation + HPLT English internal evaluation show that HPLT v2 is one of the best datasets to train LLMs. Downloads and more at either HPLT ➡️ hplt-project.org/hplt-v2-datase… or HF ➡️huggingface.co/datasets/HPLT/…

1.2K

Barry Haddow@bazril·2 Dec

Very exciting to see the 9B EuroLLM model released - made in Europe and supporting all official EU languages. More and bigger models to come ...

Pedro Martins@PedroHenMartins

Today we release EuroLLM-9B: the best EU-made multilingual LLM of its size! Check the blog post for more info and results: huggingface.co/blog/eurollm-t…. Stay tuned for the technical report and bigger and more powerful models!

831

Barry Haddow@bazril·2 Dec

EAMT Best thesis award - now open! Have you defended an MT-related thesis in 2024, in EMEA? Then why not submit to the prestigious EAMT BTA? eamt.org/2024/11/28/the… . Deadline: 2025-01-31

530

Barry Haddow retweeted

HPLT@hplt_eu·2 Dec

Join us on a new edition of the Winter School! "Pretraining Data Quality 🧐 and Multilingual Evaluation of LLMs👀" 🪂Feb. 3–5, 2025, Norway More info and registration: wiki.nlpl.eu/Community/trai… Jointly organised by @hplt_eu and the Nordic Language Processing Laboratory (NLPL)

691

Barry Haddow retweeted

Helsinki-NLP@HelsinkiNLP·20 Nov

The 18th MT marathon will be organized in beautiful Helsinki at the end of August, 2025. We invite you to a week-long gathering of researchers, developers and students with lectures, labs and hacking projects. More information will come - stay tuned!

1.5K

Barry Haddow@bazril·18 Nov

*Update:* Deadline for EAMT project grants is extended by 1 week - to November 25th. Details here: eamt.org/2024/10/21/eam…

Barry Haddow@bazril

Only 5 days left to apply for EAMT project grants

663

Barry Haddow@bazril·13 Nov

Only 5 days left to apply for EAMT project grants

926

Barry Haddow retweeted

MTSummit2025@MTSummit2025·12 Sep

📢 𝗠𝗧 𝗦𝘂𝗺𝗺𝗶𝘁 𝟮𝟬𝟮𝟱: Calls for Papers, Workshops and Tutorials 𝗮𝗿𝗲 𝗢𝘂𝘁! You'll find all the details on our website mtsummit2025.unige.ch Deadlines: 📆WS&Tutorials = 25 Nov 2024 📆CfP = 27 Jan 2025 #machinetranslation #users #researchers #translators #AI #LLM

1.5K

Barry Haddow@bazril·22 Oct

1.4K

Barry Haddow@bazril·2 Oct

New HPLT data release is out!

HPLT@hplt_eu

🚀 INTRODUCING THE LATEST HPLT MONOLINGUAL DATASETS! TL;DR: 🔍 4.5 PB of web crawls 📄 21 billion documents 💝 careful extraction, dedup, annotation and cleaning 💥 193 languages! Explore and download the new HPLT Monolingual Datasets NOW! hplt-project.org/datasets/v2.0 #HPLT

354

Barry Haddow retweeted

Vilém Zouhar@zouharvi·1 Oct

Have you recently used COMET for MT evaluation? ☄️ - Did you report the specific model? ≥12% of papers don't! - Did you report the package version? Makes a difference. - `pip install sacrecomet` generates a nice version+model signature. Not too late for WMT/EMNLP camera-ready!

6.6K

Barry Haddow retweeted

Simon Yu@simon_ycl·27 Sep

❗Are We Truly Achieving Multilingualism in LLMs or Just Relying on Translation?❗ Need multilingual instruction data and benchmarks? Just translate from English. LLM multilingualism can be easily solved! If you agree, check out our #EMNLP 2024 paper which says this is sub-optimal. arxiv.org/abs/2406.12822 🧵Below

10K

Barry Haddow retweeted

Pedro Martins@PedroHenMartins·25 Sep

Today we release the first EuroLLM paper and models: EuroLLM-1.7B and EuroLLM-1.7B-Instruct! The EuroLLM project will develop open-weight multilingual LLMs that understand and generate text in all official EU languages. Stay tuned for the bigger and stronger EuroLLMs (9B, 22B)!

13.4K

Barry Haddow retweeted

Vivek Iyer@remorax98·16 Sep

We know LLMs are poor at MT in low-resource languages (LRLs): curious how to adapt them to perform better? 🚀 Our new paper explores the interplay between scale (of MT data) and diversity (of tasks/langs) in instruction tuning in determining LLM-MT performance for LRLs💡 arxiv.org/abs/2408.12780

16.2K
