Statistical Conversion of Silent Articulation into Audible Speech using Full-Covariance HMM - Archive ouverte HAL
Journal article in Computer Speech and Language, 2016


Abstract

This article investigates the use of statistical mapping techniques for the conversion of articulatory movements into audible speech with no restriction on the vocabulary, in the context of a silent speech interface driven by ultrasound and video imaging. As a baseline, we first evaluated the GMM-based mapping with dynamic features proposed by Toda et al. (2007) for voice conversion. We then proposed a 'phonetically informed' version of this technique, based on full-covariance HMM. This approach aims (1) to explicitly model the articulatory timing of each phonetic class, and (2) to exploit linguistic knowledge to regularize the problem of silent-speech conversion. Both techniques were compared on continuous speech for two French speakers (one male, one female). For modal speech, the HMM-based technique showed a lower spectral distortion (objective evaluation). However, perceptual tests (transcription and XAB discrimination tests) showed better intelligibility for the GMM-based technique, probably related to its less fluctuating quality. For silent speech, a perceptual identification test revealed better segmental intelligibility for the HMM-based technique on consonants.
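As background, the baseline GMM-based mapping can be illustrated with a minimal frame-wise sketch. This is a hypothetical example, not the authors' implementation: it fits a joint GMM on stacked articulatory/spectral vectors and maps each input frame to the minimum mean-square-error estimate E[y | x], omitting the dynamic-feature trajectory smoothing of Toda et al. (2007) that the article builds on. All data and dimensions below are toy assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy data: x = articulatory features, y = spectral features (near-linear link).
x = rng.normal(size=(500, 2))
y = x @ np.array([[1.0, 0.5], [-0.3, 0.8]]) + 0.1 * rng.normal(size=(500, 2))
z = np.hstack([x, y])  # joint feature vectors [x; y]

gmm = GaussianMixture(n_components=4, covariance_type="full",
                      random_state=0).fit(z)
dx = x.shape[1]

def map_x_to_y(x_new):
    """MMSE mapping E[y | x] under the joint GMM (frame-wise, no dynamics)."""
    n = x_new.shape[0]
    # Component responsibilities p(m | x) from the marginal GMM over x.
    log_resp = np.zeros((n, gmm.n_components))
    for m in range(gmm.n_components):
        diff = x_new - gmm.means_[m, :dx]
        Sxx = gmm.covariances_[m, :dx, :dx]
        _, logdet = np.linalg.slogdet(Sxx)
        maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(Sxx), diff)
        log_resp[:, m] = np.log(gmm.weights_[m]) - 0.5 * (logdet + maha)
    log_resp -= log_resp.max(axis=1, keepdims=True)
    resp = np.exp(log_resp)
    resp /= resp.sum(axis=1, keepdims=True)
    # Mixture of per-component conditional means E[y | x, m].
    y_hat = np.zeros((n, gmm.means_.shape[1] - dx))
    for m in range(gmm.n_components):
        mu_x, mu_y = gmm.means_[m, :dx], gmm.means_[m, dx:]
        Sxx = gmm.covariances_[m, :dx, :dx]
        Syx = gmm.covariances_[m, dx:, :dx]
        cond_mean = mu_y + (x_new - mu_x) @ np.linalg.solve(Sxx, Syx.T)
        y_hat += resp[:, [m]] * cond_mean
    return y_hat

y_pred = map_x_to_y(x)
```

The article's HMM-based variant replaces the frame-level mixture weights with phone-dependent state alignments, which is what allows articulatory timing to be modeled per phonetic class.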
No file deposited

Dates and versions

hal-01228885 , version 1 (14-11-2015)

Identifiers

Cite

Thomas Hueber, Gérard Bailly. Statistical Conversion of Silent Articulation into Audible Speech using Full-Covariance HMM. Computer Speech and Language, 2016, 36, pp.274-293. ⟨10.1016/j.csl.2015.03.005⟩. ⟨hal-01228885⟩