Merging of Native and Non-native Speech for Low-resource Accented ASR

Abstract: This paper presents our recent study on low-resource automatic speech recognition (ASR) with accented speech. We propose multi-accent Subspace Gaussian Mixture Models (SGMMs) and accent-specific Deep Neural Networks (DNNs) to improve non-native ASR performance. In the SGMM framework, we present an original language-weighting strategy that merges the globally shared parameters of two models trained on native and non-native speech respectively. In the DNN framework, a native deep neural network is fine-tuned to non-native speech. Over the non-native baseline, we achieve relative improvements of 15% for the multi-accent SGMM and 34% for the accent-specific DNN with speaker adaptation.
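As an illustrative sketch only (not the authors' exact formulation), the language-weighting idea from the abstract can be pictured as interpolating the globally shared parameters of a native and a non-native SGMM; the weight `alpha` and the function name below are hypothetical:

```python
import numpy as np

def merge_shared_params(native_params, nonnative_params, alpha):
    """Interpolate globally shared SGMM parameters (e.g. phonetic
    subspace matrices) from a native and a non-native model.

    alpha is a language weight in [0, 1]: alpha=1 keeps only the
    native parameters, alpha=0 only the non-native ones.
    """
    return {
        name: alpha * native_params[name]
              + (1.0 - alpha) * nonnative_params[name]
        for name in native_params
    }

# Toy example with one shared matrix per model.
native = {"M": np.array([[1.0, 0.0], [0.0, 1.0]])}
nonnative = {"M": np.array([[0.0, 2.0], [2.0, 0.0]])}
merged = merge_shared_params(native, nonnative, alpha=0.5)
```

With `alpha=0.5` the merged matrix is the element-wise average of the two shared matrices; in practice such a weight would be tuned on held-out accented speech.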
Document type: Conference paper

Cited literature: 24 references

https://hal.archives-ouvertes.fr/hal-01289140
Contributor: Benjamin Lecouteux
Submitted on: Monday, January 29, 2018 - 10:57:21 AM
Last modified on: Tuesday, February 12, 2019 - 1:31:31 AM
Document(s) archived on: Friday, May 25, 2018 - 10:31:53 AM

File: SLSP2015-sarah.pdf (produced by the author(s))

Citation

Sarah Samson Juan, Laurent Besacier, Benjamin Lecouteux, Tien-Ping Tan. Merging of Native and Non-native Speech for Low-resource Accented ASR. 3rd International Conference on Statistical Language and Speech Processing, SLSP 2015, Nov 2015, Budapest, Hungary. ⟨10.1007/978-3-319-25789-1_24⟩. ⟨hal-01289140⟩
