Seth Aycock
@sethjsa
25 posts

NLP PhD student in Low-resource Translation at @AmsterdamNLP @ltl_uva / Prev @InriaParisNLP / @InfAtEd / @Cambridge_Uni

Amsterdam, The Netherlands · Joined September 2015
596 Following · 140 Followers
Seth Aycock @sethjsa ·
MT Marathon this year organised by @HelsinkiNLP was a great week - I presented my research on chain-of-thought for machine translation, worked on a mini-research project, and explored the wonderful city of Helsinki including a few trips to the sauna 🫠
0 replies · 0 reposts · 2 likes · 72 views
Seth Aycock @sethjsa ·
@SimonHiaubeng @ElliotMurphy91 The principle Maximise Minimal Means is part of one version of minimalist theory. But it's not UG - it's a third factor, domain-general constraint
0 replies · 0 reposts · 1 like · 79 views
Hiáubêng @SimonHiaubeng ·
@ElliotMurphy91 UG sounds like a minimax principle, similar to those in mathematical real analysis: the minimal maximal domain is the place where good work can begin.
1 reply · 0 reposts · 2 likes · 366 views
Seth Aycock @sethjsa ·
@Linguist_UR @ElliotMurphy91 Merge, maybe Agree, maybe Labeling. Though I believe there's work ongoing to attribute Merge itself to third factor, domain-general constraints
0 replies · 0 reposts · 1 like · 192 views
The artist formerly known as TH
@ElliotMurphy91 Genuine question: isn't the SMT that it's Merge (language-specific set formation as the basis for hierarchical constituency) and nothing else?
1 reply · 0 reposts · 0 likes · 428 views
Seth Aycock @sethjsa ·
@RaphaelMerx I'm a fan of this paper! We'd expect exactly the same for Kalamang (if we could collect an OOD test set). In the appendix we also show that the 100-example test set consists of short, easy sentences, so a ChrF++ of ~30 is really not that proficient
0 replies · 0 reposts · 1 like · 57 views
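(Editor's note: as a rough sketch of the metric mentioned above, this is how a corpus-level ChrF++ score is typically computed with sacrebleu; the hypothesis and reference strings are placeholders, not the Kalamang test set.)

```python
# Minimal sketch: corpus-level ChrF++ with sacrebleu.
# The sentences below are placeholders, not the 100-example Kalamang test set.
from sacrebleu.metrics import CHRF

chrfpp = CHRF(word_order=2)  # word_order=2 gives chrF++ (adds word n-grams to character n-grams)
hypotheses = ["the dog runs to the house"]            # system translations
references = [["the dog is running to the house"]]    # one reference stream, aligned with hypotheses
print(chrfpp.corpus_score(hypotheses, references).score)  # chrF++ on a 0-100 scale; ~30 is a weak score
```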
Seth Aycock retweeted
Di Wu @diwuNLP ·
We show that a grammar book provides little or even no help for translation in LLMs, questioning the recent "truly zero-shot translation" --- no data no gain, still 🧐
Seth Aycock @sethjsa

Our work “Can LLMs Really Learn to Translate a Low-Resource Language from One Grammar Book?” is now on arXiv! arxiv.org/abs/2409.19151 - in collaboration with @davidstap, @diwuNLP, @c_monz , and Khalil Sima'an from @illc_amsterdam and @ltl_uva 🧵

0 replies · 1 repost · 8 likes · 673 views
Seth Aycock @sethjsa ·
@JeffDean (Plus, Kalamang parallel data has been online since November 2020!)
0 replies · 0 reposts · 1 like · 23 views
Seth Aycock @sethjsa ·
@JeffDean x.com/sethjsa/status… Actually we find LLMs learn most or all of their translation ability from parallel sentences in the book, not the grammar. And we can predict translation performance just from the prompt's coverage of the test-set vocabulary! But we do find that grammar can help *linguistic* tasks
Seth Aycock @sethjsa

Our work “Can LLMs Really Learn to Translate a Low-Resource Language from One Grammar Book?” is now on arXiv! arxiv.org/abs/2409.19151 - in collaboration with @davidstap, @diwuNLP, @c_monz , and Khalil Sima'an from @illc_amsterdam and @ltl_uva 🧵

1 reply · 1 repost · 1 like · 218 views
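(Editor's note: a rough sketch of the vocabulary-coverage idea referenced above; the tokenisation, function name, and data are illustrative assumptions, not the paper's actual code.)

```python
# Illustrative sketch (not the paper's code): estimate how much of a test
# sentence's source-side vocabulary is covered by the parallel examples
# included in the prompt. Higher coverage is expected to go with better
# translation quality.
def prompt_vocab_coverage(prompt_source_sentences, test_source_sentence):
    prompt_vocab = {w for s in prompt_source_sentences for w in s.lower().split()}
    test_tokens = test_source_sentence.lower().split()
    if not test_tokens:
        return 0.0
    return sum(w in prompt_vocab for w in test_tokens) / len(test_tokens)

# Placeholder strings standing in for Kalamang source sentences:
examples_in_prompt = ["src sentence one", "src sentence two"]
print(prompt_vocab_coverage(examples_in_prompt, "src sentence three"))  # 2 of 3 tokens covered -> ~0.67
```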
Jeff Dean @JeffDean ·
Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length

Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long context capabilities, supporting millions of tokens of multimodal input. The multimodal capabilities of the model mean you can interact in sophisticated ways with entire books, very long document collections, codebases of hundreds of thousands of lines across hundreds of files, full movies, entire podcast series, and more.

Gemini 1.5 was built by an amazing team of people from @GoogleDeepMind, @GoogleResearch, and elsewhere at @Google. @OriolVinyals (my co-technical lead for the project) and I are incredibly proud of the whole team, and we're so excited to be sharing this work and what long context and in-context learning can mean for you today!

There's lots of material about this, some of which is linked below.
Main blog post: blog.google/technology/ai/…
Technical report: "Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context" goo.gle/GeminiV1-5
Videos of interactions with the model that highlight its long context abilities:
Understanding the three.js codebase: youtube.com/watch?v=SSnsmq…
Analyzing a 45 minute Buster Keaton movie: youtube.com/watch?v=wa0MT8…
Apollo 11 transcript interaction: youtube.com/watch?v=LHKL_2…

Starting today, we're offering a limited preview of 1.5 Pro to developers and enterprise customers via AI Studio and Vertex AI. Read more about this on these blogs:
Google for Developers blog: developers.googleblog.com/2024/02/gemini…
Google Cloud blog: cloud.google.com/blog/products/…

We'll also introduce 1.5 Pro with a standard 128,000 token context window when the model is ready for a wider release. Coming soon, we plan to introduce pricing tiers that start at the standard 128,000 context window and scale up to 1 million tokens, as we improve the model. Early testers can try the 1 million token context window at no cost during the testing period.

We're excited to see what developers' creativity unlocks with a very long context window. Let me walk you through the capabilities of the model and what I'm excited about!
184 replies · 1.1K reposts · 6K likes · 1.7M views
Seth Aycock @sethjsa ·
@jxmnop x.com/sethjsa/status… It turns out LLMs learn most or all translation ability from parallel sentences in the book, not the grammar. And fine-tuning a small translation model matches or beats long-context LLM results! (plus Kalamang parallel data has been online since Nov 2020)
Seth Aycock @sethjsa

Our work “Can LLMs Really Learn to Translate a Low-Resource Language from One Grammar Book?” is now on arXiv! arxiv.org/abs/2409.19151 - in collaboration with @davidstap, @diwuNLP, @c_monz , and Khalil Sima'an from @illc_amsterdam and @ltl_uva 🧵

0 replies · 0 reposts · 3 likes · 71 views
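(Editor's note: below is a minimal sketch of what "fine-tuning a small translation model" on the book's parallel sentences could look like with Hugging Face transformers; the checkpoint, data, and hyperparameters are illustrative assumptions, not the paper's recipe.)

```python
# Illustrative sketch only: fine-tune a small pretrained seq2seq MT model on
# parallel sentences extracted from the grammar book. Checkpoint, data and
# hyperparameters are assumptions, not the authors' actual setup.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

checkpoint = "facebook/nllb-200-distilled-600M"  # small multilingual MT model (Kalamang has no NLLB code)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Placeholder parallel data; in practice, Kalamang-English pairs from the book.
pairs = [{"src": "kalamang sentence ...", "tgt": "English sentence ..."}]
dataset = Dataset.from_list(pairs)

def preprocess(example):
    # Tokenize source and target sides for seq2seq training.
    return tokenizer(example["src"], text_target=example["tgt"],
                     truncation=True, max_length=128)

tokenized = dataset.map(preprocess, remove_columns=["src", "tgt"])

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="kalamang-mt",
                                  num_train_epochs=10,
                                  per_device_train_batch_size=8,
                                  learning_rate=3e-5),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```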
dr. jack morris @jxmnop ·
recently read one of the most interesting LLM papers i've ever read, the story goes something like this
> dutch PhD student/researcher Eline Visser lives on remote island in Indonesia for several years
> learns the Kalamang language, an oral language with only 100 native speakers
> she writes "The Grammar of Kalamang", a textbook on how to write in Kalamang
> since Kalamang is a spoken language only, TGOK is the only text on earth written in it
> so, there is no internet data in written Kalamang
> so, language models haven't read any Kalamang during training
> in the paper, researchers explore how to teach a language model a new language from a single book
> they evaluate various types of fine-tuning and prompting
> much to my chagrin, prompting wins (and it's not close)
> larger models & longer context windows help a lot
by the way, seems like humans still win at this task (for now)
33 replies · 201 reposts · 1.8K likes · 290.7K views
Seth Aycock @sethjsa ·
More generally, we suggest that data collection efforts for multilingual XLR (extremely low-resource) tasks like translation are best focused on parallel data over linguistic description, given its advantages in computational cost, token efficiency, and availability!
0 replies · 0 reposts · 6 likes · 332 views
Seth Aycock @sethjsa ·
Our results emphasise the importance of task-appropriate data for XLR languages: parallel data for translation, and grammatical data for linguistic tasks.
1 reply · 0 reposts · 4 likes · 345 views