LDC

812 posts

@LDCupenn

The leading non-profit consortium that creates, shares and preserves high quality language resources

Philadelphia, PA · Joined November 2014
0 Following · 373 Followers
LDC@LDCupenn·
LORELEI Multiway Translated Text: a fixed set of English texts translated into 24 languages and used in the representative language packs developed by LDC for the DARPA LORELEI program; source data was newswire, a phrasebook and an elicitation corpus bit.ly/4ooyQ5B
LDC@LDCupenn·
Multi-Language Conversational Telephone Speech 2014 – Spanish & Portuguese: 123 hours of Brazilian Portuguese, Caribbean Spanish, European Spanish and Latin American Spanish recordings for automatic language identification research & technology evaluation bit.ly/4vGujhn
LDC@LDCupenn·
KAIROS Phase 1 Evaluation Source Data, Annotation & Assessment: English & Spanish web data, annotations, knowledge graphs, system output assessment & human assessment results for 10 complex events, developed by LDC to support the DARPA KAIROS program bit.ly/43qUWej
LDC@LDCupenn·
LDC’s June newsletter contains important announcements for LDC organization account administrators and data licensees as well as details on three new publications ldc-upenn.blogspot.com
LDC@LDCupenn·
CALLHOME German Lexicon Second Edition: morphological, phonological, stress and frequency information for 318,809 German words from CELEX2 and transcripts of telephone speech between German speakers, with a pronunciation dictionary bit.ly/4wiWZhP
LDC@LDCupenn·
CALLHOME German Second Edition brings original speech and transcript datasets up to date with new transcripts and revised directories, file formats and documentation bit.ly/4fdVzib
LDC@LDCupenn·
MADCAT Phases 1-3 Composite Evaluation Set: 1643 Arabic images w/ annotations derived from web text and newswire documents copied by hand, scanned, annotated and translated for automatic conversion of foreign language text images into English transcripts bit.ly/4dhqeJ3
LDC@LDCupenn·
LDC’s May newsletter announces three new publications: MADCAT Phases 1-3 Composite Evaluation Set, CALLHOME German Second Edition and CALLHOME German Lexicon Second Edition ldc-upenn.blogspot.com
LDC@LDCupenn·
More LDC data in the LORELEI series: LORELEI Somali Representative Language Pack features monolingual and parallel text, annotations, software tools and more for human language technology development to address emergent situations bit.ly/4cAwmNc
LDC@LDCupenn·
MATERIAL Tagalog-English Language Pack has 100 hours of Tagalog conversational telephone speech, transcripts, English translations, annotations and queries designed to support cross language information retrieval bit.ly/4tabfHI
LDC@LDCupenn·
DEFT Chinese and English Light and Rich ERE Parallel Annotation: 179 Chinese-English discussion forum documents labeled for entities, relations and events, including coreference (light) and event hoppers (rich), developed by LDC for the DARPA DEFT program bit.ly/4edPl1m
LDC@LDCupenn·
Check out our April newsletter for LDC’s latest publications – DEFT Chinese and English Light and Rich ERE Parallel Annotation, MATERIAL Tagalog-English Language Pack and LORELEI Somali Representative Language Pack ldc-upenn.blogspot.com
LDC@LDCupenn·
CALLHOME Spanish Lexicon Second Edition: morphological, phonological, stress & frequency info for 45,547 Spanish words from transcripts of telephone speech between Spanish speakers and Spanish news text, with a pronunciation dictionary & G2P tools bit.ly/4sEIrX0
LDC@LDCupenn·
CALLHOME Spanish Second Edition brings original speech and transcript datasets up to date with new transcripts and revised directories, file formats and documentation bit.ly/4dfSehI
LDC@LDCupenn·
Ancient Chinese WordNet contains 55,100 records of words from the Pre-Qin period (before 221 BCE) linked to a corresponding synset in Princeton WordNet 1.6, covering 22 noun categories, 15 verb categories, and additional adjective and adverb categories bit.ly/3NybVa2
LDC@LDCupenn·
LDC’s March newsletter features the release of three new publications – Ancient Chinese WordNet, CALLHOME Spanish Second Edition and CALLHOME Spanish Lexicon Second Edition ldc-upenn.blogspot.com
LDC@LDCupenn·
The 50th Penn Linguistics Conference (PLC) is happening Feb 28–Mar 1. PLC brings together students, faculty & researchers interested in languages & linguistics to share new work and connect with peers. We wish everyone a great and productive conference. sites.google.com/sas.upenn.edu/…
LDC@LDCupenn·
More LDC data in the LORELEI series: LORELEI Russian Representative Language Pack features monolingual and parallel text, annotations, software tools and more for human language technology development to address emergent situations bit.ly/3MHDr4v
LDC@LDCupenn·
Happy International #MotherLanguageDay! This year’s theme celebrates youth voices on multilingual education, emphasizing that language is central to identity, learning, well-being and participation in society. Let’s celebrate every language, every voice. unesco.org/en/days/mother…
LDC@LDCupenn·
KAIROS Schema Learning Background Source Data: 14K English & Spanish multimodal resources collected by LDC for a Schema Learning Corpus; schemas were used with event extraction to characterize & make predictions about real-world events in the corpus bit.ly/4tPVeYa