Conference papers

Evaluation of Feature-Space Speaker Adaptation for End-to-End Acoustic Models

Abstract : This paper investigates speaker adaptation techniques for bidirectional long short-term memory (BLSTM) recurrent neural network acoustic models (AMs) trained with the connectionist temporal classification (CTC) objective function. BLSTM-CTC AMs play an important role in end-to-end automatic speech recognition systems, yet speaker adaptation algorithms for these models have received little attention. We explore three feature-space adaptation approaches for CTC AMs: feature-space maximum likelihood linear regression (fMLLR), i-vector based adaptation, and maximum a posteriori adaptation using GMM-derived features. Experimental results on the TED-LIUM corpus show that speaker adaptation, applied in an unsupervised mode and in combination with data augmentation techniques, yields relative word error rate reductions of up to 11-20%, depending on the test set, over the baseline model built on raw filter-bank features. In addition, the adaptation behavior is compared for BLSTM-CTC AMs and time-delay neural network AMs trained with the cross-entropy criterion.
Contributor : Yannick Estève
Submitted on : Monday, March 26, 2018 - 10:37:12 AM
Last modification on : Thursday, March 29, 2018 - 1:01:00 AM




  • HAL Id : hal-01728526, version 1



Natalia Tomashenko, Yannick Estève. Evaluation of Feature-Space Speaker Adaptation for End-to-End Acoustic Models. LREC 2018, May 2018, Miyazaki, Japan. ⟨hal-01728526⟩


