Conference papers

Enriching confusion networks for post-processing

Abstract : The paper proposes a new approach for a posteriori enrichment of automatic speech recognition (ASR) confusion networks (CNs). CNs are usually used to decrease the word error rate and to compute confidence measures, but they also serve in many ways to improve the post-processing of ASR outputs. For instance, they can propose alternative word hypotheses when a human corrects ASR outputs during post-editing. However, CN bins do not have a fixed length and sometimes contain only one or two word hypotheses: in this case the number of alternatives available to correct a misrecognized word is very low, which reduces the chance of helping the human annotator. Our approach for CN enrichment is based on a new similarity measure presented in this paper, computed from acoustic and linguistic word embeddings, that takes into account both the acoustic and the linguistic similarity between two words. Experimental results show that our approach is relevant: enriched CNs (with a bin size of 6) increase the potential correction of erroneous words by 23% compared with the initial CNs produced by an ASR system. In our experiments, a spoken language understanding task is also targeted.
Index Terms: confusion networks, post-processing, acoustic and linguistic word embeddings, similarity measure.
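
The enrichment idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula of the similarity measure, so the linear interpolation of two cosine similarities used below is an assumption, not the authors' measure. The names `combined_similarity`, `enrich_bin`, and `alpha`, as well as the toy two-dimensional embeddings, are hypothetical and serve only as illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_similarity(w1, w2, acoustic, linguistic, alpha=0.5):
    """ASSUMPTION: interpolate acoustic and linguistic cosine similarities.

    The paper defines its own measure; this weighting is only a stand-in.
    """
    sim_acoustic = cosine(acoustic[w1], acoustic[w2])
    sim_linguistic = cosine(linguistic[w1], linguistic[w2])
    return alpha * sim_acoustic + (1.0 - alpha) * sim_linguistic


def enrich_bin(bin_words, acoustic, linguistic, target_size=6, alpha=0.5):
    """Pad a CN bin with the words most similar to its first hypothesis."""
    words = list(bin_words)
    if not words or len(words) >= target_size:
        return words
    anchor = words[0]
    if anchor not in acoustic or anchor not in linguistic:
        return words  # no embedding available for the anchor word
    candidates = [w for w in acoustic if w in linguistic and w not in words]
    candidates.sort(
        key=lambda w: combined_similarity(anchor, w, acoustic, linguistic, alpha),
        reverse=True,
    )
    return words + candidates[: target_size - len(words)]


# Toy embeddings (hypothetical values, two dimensions for readability).
acoustic = {
    "write": [1.0, 0.0],
    "right": [0.9, 0.1],
    "rite": [1.0, 0.05],
    "compose": [0.0, 1.0],
}
linguistic = {
    "write": [1.0, 0.0],
    "right": [0.0, 1.0],
    "rite": [0.1, 1.0],
    "compose": [0.9, 0.2],
}

print(enrich_bin(["write"], acoustic, linguistic, target_size=3))
```

In this sketch a bin holding a single hypothesis is padded up to `target_size` with the vocabulary words closest to that hypothesis. Changing `alpha` shifts the ranking between acoustically confusable words and linguistically related ones.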

https://hal.archives-ouvertes.fr/hal-01585768
Contributor : Yannick Estève
Submitted on : Tuesday, September 12, 2017 - 1:04:01 AM
Last modification on : Thursday, January 11, 2018 - 3:18:04 PM
Document(s) archived on : Wednesday, December 13, 2017 - 3:19:43 PM

File

enriching-confusion-networks.p...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01585768, version 1

Citation

Sahar Ghannay, Yannick Estève, Nathalie Camelin. Enriching confusion networks for post-processing. Statistical Language and Speech Processing 2017, Oct 2017, Le Mans, France. ⟨hal-01585768⟩
