Conference paper, 2013

GMM Mapping Of Visual Features of Cued Speech From Speech Spectral Features

Abstract

In this paper, we present a statistical method based on GMM modeling that maps acoustic speech spectral features to the visual features of Cued Speech under the Minimum Mean-Square Error (MMSE) regression criterion, working at a low signal level; this is innovative and differs from the classic text-to-visual approach. Two training methods for the GMM are discussed: the Expectation-Maximization (EM) algorithm and a supervised training method. For comparison with the GMM-based mapping model, we first present results obtained with a Multiple-Linear Regression (MLR) model, also at the low signal level, and study the limitations of that approach. The experimental results demonstrate that the GMM-based mapping method significantly improves mapping performance over the MLR model, especially when the linear correlation between target and predictor is weak, as between the hand positions of Cued Speech and the acoustic speech spectral features.
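The GMM-based MMSE mapping described in the abstract follows the standard joint-density formulation: fit a GMM on stacked predictor/target vectors (e.g. with EM), then estimate the target as the posterior-weighted sum of each component's conditional mean. The sketch below is a minimal illustration of that general technique, not the paper's implementation; the use of scikit-learn, the feature dimensions, and all function names are assumptions for the example.

```python
# Minimal sketch of GMM-based MMSE regression mapping (illustrative only).
# X: (N, dx) stand-in for speech spectral features (predictor).
# Y: (N, dy) stand-in for Cued Speech visual features, e.g. hand positions (target).
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

def fit_joint_gmm(X, Y, n_components=16, seed=0):
    """Fit a GMM with full covariances on joint vectors z = [x; y] via EM."""
    Z = np.hstack([X, Y])
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="full", random_state=seed)
    gmm.fit(Z)
    return gmm

def mmse_map(gmm, X, dx):
    """MMSE estimate E[y|x] = sum_m P(m|x) (mu_y^m + S_yx^m S_xx^m^-1 (x - mu_x^m))."""
    M = gmm.n_components
    dy = gmm.means_.shape[1] - dx
    # Split each component's mean and covariance into x/y blocks.
    mu_x, mu_y = gmm.means_[:, :dx], gmm.means_[:, dx:]
    S = gmm.covariances_
    S_xx, S_yx = S[:, :dx, :dx], S[:, dx:, :dx]
    # Responsibilities P(m|x) from the marginal GMM over x.
    # (A robust implementation would work in the log domain to avoid underflow.)
    lik = np.stack([gmm.weights_[m] *
                    multivariate_normal.pdf(X, mu_x[m], S_xx[m])
                    for m in range(M)], axis=1)          # (N, M)
    post = lik / lik.sum(axis=1, keepdims=True)
    # Posterior-weighted sum of per-component linear (conditional-mean) predictions.
    Y_hat = np.zeros((X.shape[0], dy))
    for m in range(M):
        A = S_yx[m] @ np.linalg.inv(S_xx[m])             # local regression matrix
        cond = mu_y[m] + (X - mu_x[m]) @ A.T             # (N, dy)
        Y_hat += post[:, [m]] * cond
    return Y_hat

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 4))                    # synthetic predictor
    Y = np.tanh(X[:, :2]) + 0.1 * rng.standard_normal((500, 2))  # synthetic target
    gmm = fit_joint_gmm(X, Y, n_components=8)
    Y_hat = mmse_map(gmm, X, dx=4)
    print("MSE:", np.mean((Y - Y_hat) ** 2))
```

Note that with a single component this mapping reduces to one global linear regression, i.e. essentially the MLR baseline; with several components it becomes a mixture of local linear regressors weighted by posterior probabilities, which is consistent with the abstract's observation that the GMM mapping helps most when the global linear correlation between target and predictor is weak.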
Main file

MingBeautempsGang-AVSP-2013-revised.pdf (779 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00863875, version 1 (19-09-2013)

Identifiers

  • HAL Id: hal-00863875, version 1

Cite

Zuheng Ming, Denis Beautemps, Gang Feng. GMM Mapping Of Visual Features of Cued Speech From Speech Spectral Features. AVSP 2013 - 12th International Conference on Auditory-Visual Speech Processing, Aug 2013, Annecy, France. pp. 191-196. ⟨hal-00863875⟩
434 views
208 downloads
