
Merging of Native and Non-native Speech for Low-resource Accented ASR

Abstract: This paper presents our recent study on a low-resource automatic speech recognition (ASR) system for accented speech. We propose multi-accent Subspace Gaussian Mixture Models (SGMMs) and accent-specific Deep Neural Networks (DNNs) to improve non-native ASR performance. In the SGMM framework, we present an original language-weighting strategy that merges the globally shared parameters of two models trained on native and non-native speech, respectively. In the DNN framework, a native deep neural network is fine-tuned to non-native speech. Over the non-native baseline, we achieve relative improvements of 15% for multi-accent SGMM and 34% for accent-specific DNN with speaker adaptation.
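The language-weighting idea described in the abstract can be illustrated with a minimal sketch: given the globally shared parameters of a native and a non-native SGMM, form a merged set by weighted interpolation. Note this is an illustrative toy, not the paper's exact merging rule; the function name, the dictionary-of-matrices representation, and the single scalar weight `alpha` are all assumptions made here for clarity.

```python
import numpy as np

def merge_shared_params(native, nonnative, alpha):
    """Merge two sets of globally shared parameters (e.g. SGMM subspace
    matrices), keyed by name, via linear interpolation.

    `alpha` is the weight given to the native model (an assumed, simplified
    language-weighting scheme; the paper's actual strategy may differ).
    """
    merged = {}
    for name in native:
        merged[name] = alpha * native[name] + (1.0 - alpha) * nonnative[name]
    return merged

# Toy example: one shared projection matrix per model.
native_params = {"M": np.ones((2, 2))}
nonnative_params = {"M": np.zeros((2, 2))}
merged = merge_shared_params(native_params, nonnative_params, 0.25)
# With alpha = 0.25, every entry of the merged matrix is 0.25.
```

In this sketch a single weight interpolates all shared parameters at once; a per-language or per-parameter weighting would be a straightforward extension of the same loop.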
Document type :
Conference papers

Cited literature: 24 references
Contributor: Benjamin Lecouteux
Submitted on: Monday, January 29, 2018 - 10:57:21 AM
Last modification on: Thursday, October 21, 2021 - 3:50:42 AM
Long-term archiving on: Friday, May 25, 2018 - 10:31:53 AM


Files produced by the author(s)



Sarah Samson Juan, Laurent Besacier, Benjamin Lecouteux, Tien-Ping Tan. Merging of Native and Non-native Speech for Low-resource Accented ASR. 3rd International Conference on Statistical Language and Speech Processing, SLSP 2015, Nov 2015, Budapest, Hungary. ⟨10.1007/978-3-319-25789-1_24⟩. ⟨hal-01289140⟩


