Language Model Data Augmentation for Keyword Spotting

Arseniy Gorin; Rasa Lileikyté; Guangpu Huang; Lori Lamel; Jean-Luc Gauvain; Antoine Laurent

Conference paper, Year: 2016


Arseniy Gorin

Role: Author
PersonId : 1034367

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Rasa Lileikyté

Role: Author
PersonId : 1034366

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Guangpu Huang

Role: Author
PersonId : 1034371

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Lori Lamel

Role: Author
PersonId : 15965
IdHAL : lori-lamel
ORCID : 0000-0001-7443-9938
IdRef : 127578056

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Jean-Luc Gauvain

Role: Author
PersonId : 1034324

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Antoine Laurent

Role: Author
PersonId : 1034318

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Abstract

This research extends our earlier work on using machine translation (MT) and word-based recurrent neural networks to augment language model training data for keyword search in conversational Cantonese speech. MT-based data augmentation is applied to two language pairs: English-Lithuanian and English-Amharic. Using filtered N-best MT hypotheses for language modeling is found to perform better than using only the 1-best translation. Target language texts collected from the Web and filtered to select conversation-like data are used in several ways. In addition to using Web data for training the language model of the speech recognizer, we further investigate using this data to improve the language model and phrase table of the MT system to get better translations of the English data. Finally, generating text data with a character-based recurrent neural network is investigated. This approach allows new word forms to be produced, providing a way to reduce the out-of-vocabulary rate and thereby improve keyword spotting performance. We study how these different methods of language model data augmentation impact speech-to-text and keyword spotting performance for the Lithuanian and Amharic languages. The best results are obtained by combining all of the explored methods.
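
The link the abstract draws between generating new word forms and a lower out-of-vocabulary (OOV) rate can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `oov_rate`, the toy English word lists, and the idea of simply adding every generated word to the vocabulary are all assumptions made for illustration, and the numbers are not results from the paper.

```python
def oov_rate(vocabulary, tokens):
    """Return the fraction of running tokens that are missing from the vocabulary."""
    if not tokens:
        return 0.0
    missing = sum(1 for token in tokens if token not in vocabulary)
    return missing / len(tokens)


# Toy, made-up data (hypothetical, not taken from the paper).
baseline_vocab = {"walk", "walks", "house"}

# Stand-in for text produced by a character-level generator:
# it may contain word forms absent from the original training data.
generated_tokens = "walked houses walk".split()

# Stand-in for held-out evaluation text.
test_tokens = "walk walked house houses walking".split()

# Assumption: every generated word form is added to the vocabulary.
augmented_vocab = baseline_vocab | set(generated_tokens)

baseline_oov = oov_rate(baseline_vocab, test_tokens)
augmented_oov = oov_rate(augmented_vocab, test_tokens)

print(f"baseline OOV rate:  {baseline_oov:.2f}")
print(f"augmented OOV rate: {augmented_oov:.2f}")
```

In this toy setting the augmented vocabulary covers two of the three previously unseen word forms, so fewer test tokens fall outside it. A keyword that is out of vocabulary cannot be found by a word-based search, which is why lowering the OOV rate is expected to help keyword spotting.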

Keywords

speech recognition; text augmentation; language modeling; machine translation; low-resourced languages

Domains

Computer Science [cs]; Computation and Language [cs.CL]

Contributor: Limsi Publications

https://hal.science/hal-01837186

Submitted on: Thursday, July 12, 2018, 17:24:10

Last modified on: Saturday, October 7, 2023, 21:36:20

Dates and versions

hal-01837186 , version 1 (12-07-2018)

Identifiers

HAL Id : hal-01837186 , version 1

Cite

Arseniy Gorin, Rasa Lileikyté, Guangpu Huang, Lori Lamel, Jean-Luc Gauvain, et al. Language Model Data Augmentation for Keyword Spotting. Annual Conference of the International Speech Communication Association, Jan 2016, San Francisco, United States. ⟨hal-01837186⟩

Collections

CNRS LIMSI UNIV-PARIS-SACLAY SORBONNE-UNIVERSITE LISN GS-ENGINEERING GS-COMPUTER-SCIENCE
