GMM Mapping Of Visual Features of Cued Speech From Speech Spectral Features

Zuheng Ming ¹, Denis Beautemps ², Gang Feng ²
GIPSA-DPC - Département Parole et Cognition
Abstract: In this paper, we present a statistical method based on GMM modeling that maps acoustic speech spectral features to the visual features of Cued Speech under the Minimum Mean-Square Error (MMSE) regression criterion at a low signal level, an approach that is innovative and differs from the classic text-to-visual approach. Two training methods for the GMM, the Expectation-Maximization (EM) algorithm and a supervised training method, are discussed. For comparison with the GMM-based mapping model, we first present results obtained with a Multiple Linear Regression (MLR) model, also at the low signal level, and study the limitations of that approach. The experimental results demonstrate that the GMM-based mapping significantly improves performance over the MLR model, especially when the linear correlation between the target and the predictor is weak, as is the case between the hand positions of Cued Speech and the acoustic speech spectral features.
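The core of such a GMM-based MMSE mapping can be sketched in a few lines: fit a joint GMM on concatenated acoustic and visual feature vectors, then estimate the visual features from acoustic input as the conditional expectation E[y|x], a responsibility-weighted sum of per-component linear regressions. The sketch below is a hypothetical minimal illustration of this general technique using synthetic data and scikit-learn's `GaussianMixture`; it is not the paper's exact feature set or training procedure.

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

# Synthetic stand-in data: x plays the role of acoustic spectral features,
# y the role of visual (e.g. hand-position) features. Dimensions are arbitrary.
rng = np.random.default_rng(0)
n, dx, dy = 500, 4, 2
x = rng.normal(size=(n, dx))
y = x @ rng.normal(size=(dx, dy)) + 0.1 * rng.normal(size=(n, dy))

# Fit a joint GMM on the stacked vectors z = [x; y] (EM training via sklearn).
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
gmm.fit(np.hstack([x, y]))

def mmse_map(gmm, x_new, dx):
    """MMSE estimate E[y|x]: mixture of per-component linear regressions,
    weighted by each component's responsibility for the observed x."""
    preds = []
    for xi in np.atleast_2d(x_new):
        num = np.zeros(gmm.means_.shape[1] - dx)
        den = 0.0
        for w, mu, cov in zip(gmm.weights_, gmm.means_, gmm.covariances_):
            mx, my = mu[:dx], mu[dx:]
            Sxx, Syx = cov[:dx, :dx], cov[dx:, :dx]
            # p(component | x) up to a constant: weight times marginal density
            p = w * multivariate_normal.pdf(xi, mean=mx, cov=Sxx)
            # component-wise regression: E[y | x, component]
            num += p * (my + Syx @ np.linalg.solve(Sxx, xi - mx))
            den += p
        preds.append(num / den)
    return np.array(preds)

y_hat = mmse_map(gmm, x[:10], dx)
```

Because the mapping is a soft mixture of component-wise regressions, it can follow locally linear but globally nonlinear relations, which is where it outperforms a single MLR model.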
Document type: Conference papers
Cited literature: 21 references
Contributor: Denis Beautemps
Submitted on: Thursday, September 19, 2013 - 5:08:08 PM
Last modification on: Friday, September 28, 2018 - 1:15:13 AM
Long-term archiving on: Friday, December 20, 2013 - 4:05:22 PM
Files produced by the author(s)
HAL Id: hal-00863875, version 1

Zuheng Ming, Denis Beautemps, Gang Feng. GMM Mapping Of Visual Features of Cued Speech From Speech Spectral Features. 12th International Conference on Auditory-Visual Speech Processing (AVSP 2013), Aug 2013, Annecy, France. pp.191 - 196. ⟨hal-00863875⟩