Improving recognition of proper nouns (in ASR) through generation and filtering of phonetic transcriptions

Abstract: Accurate phonetic transcription of proper nouns can be an important resource for commercial applications that embed speech technologies, such as audio indexing and vocal phone directory lookup. However, an accurate phonetic transcription is more difficult to obtain for proper nouns than for regular words, because the phonetic transcription of a proper noun depends on both the origin of the speaker pronouncing it and the origin of the proper noun itself. This work proposes a method that extracts phonetic transcriptions of proper nouns from actual utterances of those proper nouns, thus yielding transcriptions based on practical use instead of mere pronunciation rules. The proposed method consists of a process that first extracts phonetic transcriptions and then iteratively filters them. To initialize the process, an alignment dictionary is used to detect word boundaries. A rule-based grapheme-to-phoneme generator (LIA_PHON), a knowledge-based approach (JSM), and a statistical machine translation based system were evaluated for this alignment. Compared to our reference dictionary (BDLEX supplemented by LIA_PHON for missing words) on the ESTER 1 French broadcast news corpus, we were able to significantly decrease the Word Error Rate (WER) on segments of speech with proper nouns, without negatively affecting the WER on the rest of the corpus.
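
The abstract reports its results as Word Error Rate (WER). As a reminder of what that metric measures, here is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; this is not the scoring tool used in the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance / number of reference words.

    Illustrative sketch only; tokenization is plain whitespace splitting.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For instance, a five-word reference whose proper noun is misrecognized as a different word yields one substitution, so the WER is 1/5. Because insertions count as errors, the value can exceed 1 when the hypothesis is much longer than the reference.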
Document type:
Journal article
Computer Speech and Language, Elsevier, 2014, 28 (4), pp.979-996. 〈10.1016/j.csl.2014.02.006〉

https://hal.archives-ouvertes.fr/hal-01433238
Contributor: Sylvain Meignier
Submitted on: Wednesday, March 22, 2017 - 17:18:49
Last modified on: Tuesday, March 28, 2017 - 01:05:28
Document(s) archived on: Friday, June 23, 2017 - 12:27:55

File

CSL_antoine.pdf
Files produced by the author(s)

Citation

Antoine Laurent, Sylvain Meignier, Paul Deléglise. Improving recognition of proper nouns (in ASR) through generation and filtering of phonetic transcriptions. Computer Speech and Language, Elsevier, 2014, 28 (4), pp.979-996. 〈10.1016/j.csl.2014.02.006〉. 〈hal-01433238〉

Metrics

Record views

121

File downloads

147