Improving recognition of proper nouns (in ASR) through generation and filtering of phonetic transcriptions

Antoine Laurent; Sylvain Meignier; Paul Deléglise

doi:10.1016/j.csl.2014.02.006

Journal article. Computer Speech and Language. Year: 2014

Antoine Laurent

Role: Author
PersonId : 13586
IdHAL : antoine-laurent
ORCID : 0000-0002-2653-1008
IdRef : 147099072

Laboratoire d'Informatique de l'Université du Mans

Sylvain Meignier

Role: Author
PersonId : 11674
IdHAL : sylvain-meignier
ORCID : 0000-0001-7687-073X
IdRef : 182269086

Laboratoire d'Informatique de l'Université du Mans

Paul Deléglise

Role: Author
PersonId : 998324

Laboratoire d'Informatique de l'Université du Mans

Abstract

Accurate phonetic transcriptions of proper nouns can be an important resource for commercial applications that embed speech technologies, such as audio indexing and voice-driven phone directory lookup. However, an accurate phonetic transcription is harder to obtain for a proper noun than for a regular word, because the pronunciation of a proper noun depends on both the origin of the speaker and the origin of the proper noun itself. This work proposes a method that extracts phonetic transcriptions of proper nouns from actual utterances of those proper nouns, thus yielding transcriptions based on practical use rather than on pronunciation rules alone. The method first extracts phonetic transcriptions and then iteratively filters them. To initialize the process, an alignment dictionary is used to detect word boundaries. A rule-based grapheme-to-phoneme generator (LIA_PHON), a knowledge-based approach (JSM), and a Statistical Machine Translation based system were evaluated for this alignment. Compared to our reference dictionary (BDLEX supplemented by LIA_PHON for missing words) on the ESTER 1 French broadcast news corpus, we were able to significantly decrease the Word Error Rate (WER) on speech segments containing proper nouns, without negatively affecting the WER on the rest of the corpus.
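The Word Error Rate reported in the abstract is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring tool used by the authors; the function name `word_error_rate` and the example sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative helper (hypothetical name); words are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of ref_word
                curr[j - 1] + 1,                  # insertion of hyp_word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one misrecognized proper noun out of four reference words.
print(word_error_rate("le président jacques chirac", "le président jack chirac"))  # 0.25
```

Restricting such a score to the segments that contain proper nouns, and computing it separately on the remaining segments, gives the two figures the abstract compares.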

Keywords

Speech recognition; SMT; Phonetic transcription; Proper nouns; Moses; G2P

Domains

Computation and Language [cs.CL]

Main file

CSL_antoine.pdf (2.09 MB)

Origin: Files produced by the author(s)

https://hal.science/hal-01433238

Submitted on: Wednesday, 22 March 2017, 17:18:49

Last modified on: Tuesday, 28 March 2017, 01:05:28

Long-term archiving on: Friday, 23 June 2017, 12:27:55

Dates and versions

hal-01433238 , version 1 (22-03-2017)

Identifiers

HAL Id : hal-01433238 , version 1
DOI : 10.1016/j.csl.2014.02.006

Cite

Antoine Laurent, Sylvain Meignier, Paul Deléglise. Improving recognition of proper nouns (in ASR) through generation and filtering of phonetic transcriptions. Computer Speech and Language, 2014, 28 (4), pp.979-996. ⟨10.1016/j.csl.2014.02.006⟩. ⟨hal-01433238⟩

Collections

UNIV-LEMANS LIUM LIUM-LST
