Bag of n-gram driven decoding for LVCSR system harnessing

Abstract: This paper focuses on combining automatic speech recognition (ASR) systems using driven decoding paradigms. The driven decoding algorithm (DDA) uses the 1-best hypothesis provided by an auxiliary system as an additional knowledge source in the search algorithm of a primary system. Previous studies showed that DDA outperforms ROVER when the primary system is guided by a more accurate system. In this paper we propose a new method for handling auxiliary transcriptions, which are represented as a bag of n-grams (BONG) without temporal matching. This modification makes it easier to combine several hypotheses produced by different auxiliary systems. Using BONG combination with hypotheses provided by two auxiliary systems, each of which had a WER above 23% on the same data, our experiments show that a CMU Sphinx based ASR system can reduce its WER from 19.85% to 18.66%, which is better than the results obtained with DDA or classical ROVER combination.
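The bag-of-n-grams idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the primary decoder uses the bag, so everything below is an assumption made for illustration: the function names (`build_bong`, `match_order`, `relative_wer_reduction`), the maximum n-gram order of 3, and the toy transcriptions are hypothetical and are not taken from the paper. The sketch pools n-grams from several auxiliary hypotheses with no time alignment, then reports the longest n-gram ending at a candidate word that appears in the bag. It also computes the relative WER reduction implied by the reported figures.

```python
from collections import Counter


def ngrams(words, n):
    """Return all contiguous n-grams of a word list as tuples."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def build_bong(hypotheses, max_n=3):
    """Pool n-grams (orders 1..max_n) from all auxiliary hypotheses.

    No timestamps are kept: the bag only records which n-grams occurred,
    so hypotheses from several auxiliary systems can simply be merged.
    """
    bag = Counter()
    for hyp in hypotheses:
        words = hyp.split()
        for n in range(1, max_n + 1):
            bag.update(ngrams(words, n))
    return bag


def match_order(bag, history, word, max_n=3):
    """Length of the longest n-gram ending in `word` that is in the bag.

    Returns 0 when even the unigram is absent. How a decoder would turn
    this value into a score is NOT given in the abstract; this helper is
    a hypothetical illustration only.
    """
    context = list(history) + [word]
    for n in range(min(max_n, len(context)), 0, -1):
        if tuple(context[-n:]) in bag:
            return n
    return 0


def relative_wer_reduction(before, after):
    """Relative reduction between two WER values given in percent."""
    return (before - after) / before


# Toy auxiliary transcriptions (invented for this example).
bag = build_bong(["the cat sat on the mat", "a cat sat on a mat"])
print(match_order(bag, ["the", "cat"], "sat"))  # trigram "the cat sat" is in the bag
print(match_order(bag, ["the"], "dog"))         # "dog" never appears
print(round(100 * relative_wer_reduction(19.85, 18.66), 2))
```

With the figures reported in the abstract, the absolute gain is 1.19 WER points, which corresponds to a relative reduction of about 6%.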
Document type:
Conference paper
IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Dec 2011, Waikoloa, United States

https://hal.archives-ouvertes.fr/hal-01315538
Contributor: Bibliothèque Universitaire Déposants Hal-Avignon
Submitted on: Friday, May 13, 2016 - 12:32:30
Last modified on: Saturday, March 23, 2019 - 01:22:35

Identifiers

  • HAL Id : hal-01315538, version 1

Citation

Bougares Fethi, Yannick Estevez, Paul Deléglise, Georges Linarès. Bag of n-gram driven decoding for LVCSR system harnessing. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Dec 2011, Waikoloa, United States. 〈hal-01315538〉
