
Using closely-related language to build an ASR for a very under-resourced language: Iban

Abstract : This paper describes our work on an automatic speech recognition (ASR) system for an under-resourced language, Iban, which is mainly spoken in Sarawak, Malaysia. Because no ASR resources existed for Iban, we collected 8 hours of data to begin this study. We employed bootstrapping techniques involving a closely-related language to rapidly build and improve an Iban system. First, we used already available data from Malay, a locally dominant language in Malaysia, to bootstrap a grapheme-to-phoneme (G2P) system for the target language. We also built various types of G2Ps, including a grapheme-based and an English G2P, to produce different versions of dictionaries. We tested all of the dictionaries on the Iban ASR to identify the best version. Second, we improved the word error rate (WER) of the baseline GMM system by using subspace Gaussian mixture models (SGMM). To test this, we set two levels of data sparseness on the Iban data: 7 hours and 1 hour of transcribed speech. We investigated cross-lingual SGMM, where the shared parameters were obtained in either a monolingual or a multilingual fashion and then applied to the target language for training. Experiments using out-of-language data, English and Malay, as source languages resulted in lower WERs when Iban data is very limited.
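The abstract reports its results as word error rate (WER). As a minimal sketch of how that metric is conventionally computed (not the authors' scoring setup, which the abstract does not describe), the following Python function counts word-level substitutions, deletions, and insertions by edit distance and divides by the number of reference words. The function name `word_error_rate` and the placeholder tokens in the usage note are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch only: tokenization is plain whitespace splitting,
    which is an assumption and may differ from a real ASR scoring tool.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

For example, `word_error_rate("a b c d", "a x c")` gives 0.5: one substitution and one deletion against four reference words.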
Document type :
Conference papers

Cited literature [28 references]
Contributor : Laurent Besacier
Submitted on : Wednesday, August 13, 2014 - 10:14:37 AM
Last modification on : Friday, July 17, 2020 - 11:10:26 AM
Long-term archiving on : Wednesday, November 26, 2014 - 11:50:32 PM


  • HAL Id : hal-01055576, version 1


Sarah Samson Juan, Laurent Besacier, Benjamin Lecouteux, Tan Tien Ping. Using closely-related language to build an ASR for a very under-resourced language: Iban. Oriental COCOSDA 2014, Sep 2014, Phuket, Thailand. 5 p. ⟨hal-01055576⟩


