Enriching confusion networks for post-processing

Abstract: This paper proposes a new approach for the a posteriori enrichment of automatic speech recognition (ASR) confusion networks (CNs). CNs are typically used to reduce word error rate and to compute confidence measures, but they also serve in many ways to improve the post-processing of ASR outputs. For instance, they can supply alternative word hypotheses when a human corrects ASR outputs during post-editing. However, CN bins do not have a fixed length and sometimes contain only one or two word hypotheses; in that case very few alternatives are available to correct a misrecognized word, which reduces the chance of helping the human annotator. Our approach to CN enrichment relies on a new similarity measure, presented in this paper and computed from acoustic and linguistic word embeddings, which takes into account both the acoustic and the linguistic similarity between two words. Experimental results show that the approach is relevant: enriched CNs (with a bin size of 6) increase the potential correction of erroneous words by 23% compared with the initial CNs produced by an ASR system. Our experiments also target a spoken language understanding task.
Index Terms: confusion networks, post-processing, acoustic and linguistic word embeddings, similarity measure.
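The abstract does not give the formula of the similarity measure, so the following Python sketch only illustrates the general idea under stated assumptions: the two similarities are taken to be cosine similarities, they are combined by a linear interpolation with a hypothetical weight `alpha`, and a short bin is padded with the vocabulary words most similar to its top hypothesis. The function names, the toy vectors, and the padding strategy are illustrative and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def combined_similarity(w1, w2, acoustic, linguistic, alpha=0.5):
    """Assumed combination: linear interpolation of the acoustic and
    linguistic cosine similarities (alpha is a hypothetical weight)."""
    sim_acoustic = cosine(acoustic[w1], acoustic[w2])
    sim_linguistic = cosine(linguistic[w1], linguistic[w2])
    return alpha * sim_acoustic + (1 - alpha) * sim_linguistic


def enrich_bin(bin_words, vocab, acoustic, linguistic, target_size=6, alpha=0.5):
    """Pad a confusion-network bin up to target_size with the vocabulary
    words most similar to the bin's first (top) hypothesis."""
    if not bin_words or len(bin_words) >= target_size:
        return list(bin_words)
    anchor = bin_words[0]
    candidates = [w for w in vocab if w not in bin_words]
    candidates.sort(
        key=lambda w: combined_similarity(anchor, w, acoustic, linguistic, alpha),
        reverse=True,
    )
    return list(bin_words) + candidates[: target_size - len(bin_words)]


# Toy two-dimensional embeddings, invented for this illustration only.
acoustic = {"write": [1.0, 0.0], "right": [0.9, 0.1],
            "rite": [0.95, 0.05], "cat": [0.0, 1.0]}
linguistic = {"write": [1.0, 0.0], "right": [0.0, 1.0],
              "rite": [0.5, 0.5], "cat": [0.1, 0.9]}
vocab = ["write", "right", "rite", "cat"]

enriched = enrich_bin(["write"], vocab, acoustic, linguistic, target_size=3)
print(enriched)
```

In this toy setting a bin holding a single hypothesis is padded to three words; with the bin size of 6 mentioned in the abstract, the same call would use `target_size=6` and a realistic vocabulary.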
Document type:
Conference paper
Statistical Language and Speech Processing 2017, Oct 2017, Le Mans, France. 〈http://grammars.grlmc.com/SLSP2017/〉

https://hal.archives-ouvertes.fr/hal-01585768
Contributor: Yannick Estève
Submitted on: Tuesday, 12 September 2017 - 01:04:01
Last modified on: Thursday, 11 January 2018 - 15:18:04
Document(s) archived on: Wednesday, 13 December 2017 - 15:19:43

File

enriching-confusion-networks.p...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01585768, version 1

Citation

Sahar Ghannay, Yannick Estève, Nathalie Camelin. Enriching confusion networks for post-processing. Statistical Language and Speech Processing 2017, Oct 2017, Le Mans, France. 〈http://grammars.grlmc.com/SLSP2017/〉. 〈hal-01585768〉

Metrics

Record views

202

File downloads

86