Conference paper. Year: 2015

Using Resources from a Closely-related Language to Develop ASR for a Very Under-resourced Language: A Case Study for Iban

Abstract

This paper presents our strategies for developing an automatic speech recognition (ASR) system for Iban, an under-resourced language. We faced several challenges, including the absence of a pronunciation dictionary and a lack of training material for building acoustic models. To overcome these problems, we propose approaches that exploit resources from a closely related language, Malay. We developed a semi-supervised method for building the pronunciation dictionary and applied cross-lingual strategies to improve acoustic models trained with very limited data. Both approaches gave very encouraging results, which show that data from a closely related language, if available, can be exploited to build ASR for a new language. In the final part of the paper, we present a zero-shot ASR system that uses Malay resources and can serve as an alternative method for transcribing Iban speech.
Main file
IS2015_samsonjuan_camera-ready.pdf (124.04 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01170493, version 1 (15-09-2015)

Identifiers

  • HAL Id: hal-01170493, version 1

Cite

Sarah Samson Juan, Laurent Besacier, Benjamin Lecouteux, Mohamed Dyab. Using Resources from a Closely-related Language to Develop ASR for a Very Under-resourced Language: A Case Study for Iban. Interspeech 2015, Sep 2015, Dresden, Germany. ⟨hal-01170493⟩
