Speed perturbation and vowel duration modeling for ASR in Hausa and Wolof languages

Abstract: Automatic Speech Recognition (ASR) for under-resourced Sub-Saharan African languages faces several challenges: small amounts of transcribed speech, written language normalization issues, few text resources available for language modeling, and language-specific features (tones, morphology, etc.) that must be taken into account to optimize ASR performance. This paper addresses some of these challenges through the development of ASR systems for two Sub-Saharan African languages: Hausa and Wolof. First, we investigate a data augmentation technique (speed perturbation) to compensate for the lack of resources. Second, and as the main contribution, we attempt to model the vowel length contrast that exists in both languages. For reproducible experiments, the ASR systems developed for Hausa and Wolof are made available to the research community on GitHub. To our knowledge, the Wolof ASR system presented in this paper is the first large vocabulary continuous speech recognition system ever developed for this language.
Document type:
Conference paper
Interspeech 2016, Sep 2016, San Francisco, United States. Interspeech 2016 proceedings
Cited literature [22 references]

https://hal.archives-ouvertes.fr/hal-01350057
Contributor: Laurent Besacier
Submitted on: Friday, 29 July 2016 - 15:14:46
Last modified on: Wednesday, 31 October 2018 - 12:24:05
Document(s) archived on: Sunday, 30 October 2016 - 11:20:54

File

speed-perturbation-vowelFINAL-...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01350057, version 1

Citation

Elodie Gauthier, Laurent Besacier, Sylvie Voisin. Speed perturbation and vowel duration modeling for ASR in Hausa and Wolof languages. Interspeech 2016, Sep 2016, San Francisco, United States. Interspeech 2016 proceedings. 〈hal-01350057〉
