Conference papers

Recent advances in Automatic Speech Recognition for Vietnamese

Abstract: This paper presents our recent work on automatic speech recognition (ASR) for Vietnamese. First, we describe our methods and tools for collecting and processing text data. For language modeling, we investigate word, sub-word, and hybrid word/sub-word models. For acoustic modeling, when only limited speech data are available for Vietnamese, we propose several crosslingual acoustic modeling techniques. Furthermore, since sub-word units can reduce the high out-of-vocabulary rate and compensate for the lack of text resources in statistical language modeling, we propose several methods to decompose, normalize, and combine word and sub-word lattices generated by different ASR systems. Experimental results on the VnSpeechCorpus demonstrate the feasibility of our methods.
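
The abstract's point that sub-word units can lower the out-of-vocabulary (OOV) rate can be illustrated with a minimal sketch. It assumes that the sub-word unit is the syllable and that the syllables of a multi-syllable word are joined with an underscore; both are assumptions made for illustration, not details stated in this record. The function names and the toy data are hypothetical and are not taken from the paper or from the VnSpeechCorpus.

```python
def oov_rate(tokens, vocabulary):
    """Return the fraction of running tokens that are missing from the vocabulary."""
    if not tokens:
        return 0.0
    missing = sum(1 for token in tokens if token not in vocabulary)
    return missing / len(tokens)


def to_syllables(words):
    """Split word tokens into syllables (assumes '_' joins the syllables of a word)."""
    return [syllable for word in words for syllable in word.split("_")]


# Toy, made-up data for illustration only (not from the paper).
train_words = ["học_sinh", "đi", "học", "sinh_viên"]
test_words = ["học_viên", "đi", "học"]

# Vocabularies built at the word level and at the syllable level.
word_vocab = set(train_words)
syllable_vocab = set(to_syllables(train_words))

# The unseen word "học_viên" is OOV at the word level, but both of its
# syllables occur in the training text, so it is covered at the syllable level.
word_oov = oov_rate(test_words, word_vocab)
syllable_oov = oov_rate(to_syllables(test_words), syllable_vocab)

print(f"word-level OOV rate:     {word_oov:.2f}")
print(f"syllable-level OOV rate: {syllable_oov:.2f}")
```

In this toy setting one of the three test words is unseen, while every test syllable has been seen in training. Real corpora behave less cleanly, and the sketch says nothing about the rates reported in the paper.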

Contributor: Brigitte Bigi
Submitted on: Wednesday, February 14, 2018 - 10:29:19 AM
Last modification on: Thursday, February 27, 2020 - 10:44:03 AM
Long-term archiving on: Friday, May 4, 2018 - 10:13:33 AM


Publisher files allowed on an open archive


  • HAL Id: hal-01705670, version 1



Viet-Bac Le, Laurent Besacier, Sopheap Seng, Brigitte Bigi, Thi-Ngoc-Diep Do. Recent advances in Automatic Speech Recognition for Vietnamese. The first International Workshop on Spoken Languages Technologies for Under-resourced languages, 2008, Hanoi, Vietnam. ⟨hal-01705670⟩


