LDC

800 posts

@LDCupenn

The leading non-profit consortium that creates, shares and preserves high quality language resources

Philadelphia, PA · Joined November 2014
0 Following · 373 Followers
LDC@LDCupenn·
CALLHOME Spanish Lexicon Second Edition: morphological, phonological, stress & frequency info for 45,547 Spanish words from transcripts of telephone speech between Spanish speakers and Spanish news text, with a pronunciation dictionary & G2P tools bit.ly/4sEIrX0
LDC@LDCupenn·
CALLHOME Spanish Second Edition brings original speech and transcript datasets up to date with new transcripts and revised directories, file formats and documentation bit.ly/4dfSehI
LDC@LDCupenn·
Ancient Chinese WordNet contains 55,100 records of words from the Pre-Qin period (before 221 BCE) linked to a corresponding synset in Princeton WordNet 1.6, covering 22 noun categories, 15 verb categories, and additional adjective and adverb categories bit.ly/3NybVa2
LDC@LDCupenn·
LDC’s March newsletter features the release of three new publications – Ancient Chinese WordNet, CALLHOME Spanish Second Edition and CALLHOME Spanish Lexicon Second Edition ldc-upenn.blogspot.com
LDC@LDCupenn·
The 50th Penn Linguistics Conference (PLC) is happening Feb 28–Mar 1. PLC brings together students, faculty & researchers interested in languages & linguistics to share new work and connect with peers. We wish everyone a great and productive conference. sites.google.com/sas.upenn.edu/…
LDC@LDCupenn·
More LDC data in the LORELEI series: LORELEI Russian Representative Language Pack features monolingual and parallel text, annotations, software tools and more for human language technology development to address emergent situations bit.ly/3MHDr4v
LDC@LDCupenn·
Happy International #MotherLanguageDay! This year’s theme celebrates youth voices on multilingual education, emphasizing that language is central to identity, learning, well-being and participation in society. Let’s celebrate every language, every voice. unesco.org/en/days/mother…
LDC@LDCupenn·
KAIROS Schema Learning Background Source Data: 14K English & Spanish multimodal resources collected by LDC for a Schema Learning Corpus; schemas were used with event extraction to characterize & make predictions about real-world events in the corpus bit.ly/4tPVeYa
LDC@LDCupenn·
2022 NIST Language Recognition Evaluation Test and Development Sets: 222 hours of telephone speech and broadcast narrowband speech in 14 languages, plus turnkey evaluation documentation, emphasizing African languages and related English and French dialects bit.ly/4rIEJLs
LDC@LDCupenn·
Catch up on 2026 membership discounts, spring data scholarship awards and the release of three new publications in LDC’s February newsletter ldc-upenn.blogspot.com
LDC@LDCupenn·
MATERIAL Swahili-English Language Pack has 112 hours of Swahili conversational telephone speech, transcripts, English translations, annotations and queries designed to support cross language information retrieval bit.ly/49SWG3R
LDC@LDCupenn·
CALLHOME Japanese Lexicon Second Edition: morphological, phonological and stress information for 80,688 Japanese words from transcripts of telephone conversations between native Japanese speakers, along with a pronunciation dictionary and G2P tools bit.ly/3NlxvhC
LDC@LDCupenn·
CALLHOME Japanese Second Edition brings original speech and transcript datasets up to date with new transcripts and revised directories, file formats and documentation bit.ly/49kSdqz
LDC@LDCupenn·
LDC welcomes 2026 with its January newsletter featuring three publications and membership renewal information ldc-upenn.blogspot.com
LDC@LDCupenn·
LORELEI Sinhala Incident Language Pack: monolingual and parallel text, annotations, software tools and more for human language technology development in this under-resourced language bit.ly/4iVnJP1
LDC@LDCupenn·
2021 NIST SRE Test Set: 447 hours of Cantonese, Mandarin, and English conversational telephone speech, audio from video, and selfie image data for development and test, along with answer keys, enrollment, trial files and documentation bit.ly/4q35JV4
LDC@LDCupenn·
Check out LDC’s December newsletter for the latest news and publications and join us in celebrating the release of our 1000th corpus! ldc-upenn.blogspot.com
LDC@LDCupenn·
Check out ISCA-SAC's Speech Pitch podcast to hear from LDC’s Denise DiPersio #18.9. This session was recorded during Interspeech 2025. Listen to Denise talk about LDC’s past, present and future and LDC’s involvement in Interspeech since the 2009 conference in Brighton.
ISCA-SAC@iscaSAC

Interspeech Impressions 2025 from Speech Pitch podcast by ISCA-SAC continues with EIGHT MORE mini episodes from the heart of Interspeech. Listen now on YouTube youtube.com/@SpeechPitch and Spotify open.spotify.com/show/6buZ7O7qa…

LDC@LDCupenn·
LORELEI Ilocano Incident Language Pack: monolingual and parallel text, annotations, software tools and more for human language technology development in this under-resourced language bit.ly/43moVEw
LDC@LDCupenn·
AnnoDIFP CTS Audio and Transcripts: 242.52 hours of English telephone audio and transcripts from 1179 calls involving 327 participants, paired with scores from two self-reported personality assessments bit.ly/47J6JHX