LDC

804 posts


@LDCupenn

The leading non-profit consortium that creates, shares and preserves high quality language resources

Philadelphia, PA. Joined November 2014
0 Following, 372 Followers
LDC @LDCupenn
More LDC data in the LORELEI series: LORELEI Somali Representative Language Pack features monolingual and parallel text, annotations, software tools and more for human language technology development to address emergent situations bit.ly/4cAwmNc
MATERIAL Tagalog-English Language Pack has 100 hours of Tagalog conversational telephone speech, transcripts, English translations, annotations and queries designed to support cross language information retrieval bit.ly/4tabfHI
DEFT Chinese and English Light and Rich ERE Parallel Annotation: 179 Chinese-English discussion forum documents labeled for entities, relations and events, including coreference (light) and event hoppers (rich), developed by LDC for the DARPA DEFT program bit.ly/4edPl1m
Check out our April newsletter for LDC’s latest publications – DEFT Chinese and English Light and Rich ERE Parallel Annotation, MATERIAL Tagalog-English Language Pack and LORELEI Somali Representative Language Pack ldc-upenn.blogspot.com
CALLHOME Spanish Lexicon Second Edition: morphological, phonological, stress & frequency info for 45,547 Spanish words from transcripts of telephone speech between Spanish speakers and Spanish news text, with a pronunciation dictionary & G2P tools bit.ly/4sEIrX0
CALLHOME Spanish Second Edition brings original speech and transcript datasets up to date with new transcripts and revised directories, file formats and documentation bit.ly/4dfSehI
Ancient Chinese WordNet contains 55,100 records of words from the Pre-Qin period (before 221 BCE) linked to a corresponding synset in Princeton WordNet 1.6, covering 22 noun categories, 15 verb categories, and additional adjective and adverb categories bit.ly/3NybVa2
LDC’s March newsletter features the release of three new publications – Ancient Chinese WordNet, CALLHOME Spanish Second Edition and CALLHOME Spanish Lexicon Second Edition ldc-upenn.blogspot.com
The 50th Penn Linguistics Conference (PLC) is happening Feb 28–Mar 1. PLC brings together students, faculty & researchers interested in languages & linguistics to share new work and connect with peers. We wish everyone a great and productive conference. sites.google.com/sas.upenn.edu/…
More LDC data in the LORELEI series: LORELEI Russian Representative Language Pack features monolingual and parallel text, annotations, software tools and more for human language technology development to address emergent situations bit.ly/3MHDr4v
Happy International #MotherLanguageDay! This year’s theme celebrates youth voices on multilingual education – emphasizing that language is central to identity, learning, well-being and participation in society. Let’s celebrate every language, every voice unesco.org/en/days/mother…
KAIROS Schema Learning Background Source Data: 14K English & Spanish multimodal resources collected by LDC for a Schema Learning Corpus; schemas were used with event extraction to characterize & make predictions about real-world events in the corpus bit.ly/4tPVeYa
2022 NIST Language Recognition Evaluation Test and Development Sets: 222 hours of telephone speech and broadcast narrowband speech in 14 languages, plus turnkey evaluation documentation, emphasizing African languages and related English and French dialects bit.ly/4rIEJLs
Catch up on 2026 membership discounts, spring data scholarship awards and the release of three new publications in LDC’s February newsletter ldc-upenn.blogspot.com
MATERIAL Swahili-English Language Pack has 112 hours of Swahili conversational telephone speech, transcripts, English translations, annotations and queries designed to support cross language information retrieval bit.ly/49SWG3R
CALLHOME Japanese Lexicon Second Edition: morphological, phonological and stress information for 80,688 Japanese words from transcripts of telephone conversations between native Japanese speakers, along with a pronunciation dictionary and G2P tools bit.ly/3NlxvhC
CALLHOME Japanese Second Edition brings original speech and transcript datasets up to date with new transcripts and revised directories, file formats and documentation bit.ly/49kSdqz
LDC welcomes 2026 with its January newsletter featuring three publications and membership renewal information ldc-upenn.blogspot.com
LORELEI Sinhala Incident Language Pack: monolingual and parallel text, annotations, software tools and more for human language technology development in this under-resourced language bit.ly/4iVnJP1
2021 NIST SRE Test Set: 447 hours of Cantonese, Mandarin, and English conversational telephone speech, audio from video, and selfie image data for development and test, along with answer keys, enrollment, trial files and documentation bit.ly/4q35JV4