Journal article. IEEE Transactions on Audio, Speech and Language Processing. Year: 2009

Automatic recognition of French Cued Speech using multimodal fusion based on hidden Markov models

Abstract

This article presents automatic recognition of French Cued Speech based on hidden Markov models (HMMs). Cued Speech is a visual system that uses handshapes in different positions, in combination with the lip patterns of speech, to make all the sounds of spoken language clearly understandable to deaf and hearing-impaired people. Its aim is to overcome the problems of lipreading and thus enable deaf children and adults to fully understand spoken language. Automatic recognition of Cued Speech requires both lip shape recognition and gesture recognition, and the integration of the two modalities is of the greatest importance. In this study, the lip shape component is fused with the gesture components to realize Cued Speech recognition. Vowel recognition and consonant recognition experiments were conducted using concatenative feature fusion and multi-stream HMM decision fusion. For vowel recognition, an accuracy of 87.6% was obtained, a 61.3% relative improvement over the use of lip shape parameters alone. For consonant recognition, an accuracy of 78.9% was obtained, a 56% relative improvement over the use of lip shape alone. In addition, a complete phoneme recognition experiment using concatenated feature vectors and Gaussian mixture model (GMM) discrimination yielded a phoneme accuracy of 74.4%. These results are comparable to the accuracies obtained using the audio signal, and they show the effectiveness of the proposed approaches for Cued Speech recognition.
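
The two fusion strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy feature values, and the convention that the two stream weights sum to one are all assumptions made for illustration. Feature fusion joins the lip and hand vectors of each frame into a single observation, while decision fusion combines per-stream state log-likelihoods with a weight.

```python
import math


def concat_fusion(lip_frames, hand_frames):
    """Feature fusion: join per-frame lip and hand vectors into one vector."""
    if len(lip_frames) != len(hand_frames):
        raise ValueError("both streams must have the same number of frames")
    return [list(lip) + list(hand) for lip, hand in zip(lip_frames, hand_frames)]


def multistream_log_likelihood(log_b_lip, log_b_hand, lip_weight):
    """Decision fusion: weighted sum of per-state stream log-likelihoods.

    Assumes the hand stream weight is 1 - lip_weight (an illustrative
    convention, not taken from the article).
    """
    if not 0.0 <= lip_weight <= 1.0:
        raise ValueError("lip_weight must lie in [0, 1]")
    hand_weight = 1.0 - lip_weight
    return [lip_weight * a + hand_weight * b
            for a, b in zip(log_b_lip, log_b_hand)]


# Hypothetical toy data: 2 frames, 3 lip parameters and 2 hand parameters each.
lip = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
hand = [[1.0, 2.0], [3.0, 4.0]]
fused = concat_fusion(lip, hand)  # each fused frame has 3 + 2 = 5 values

# Hypothetical per-state likelihoods of one frame under each stream.
log_b_lip = [math.log(p) for p in (0.6, 0.3, 0.1)]
log_b_hand = [math.log(p) for p in (0.2, 0.7, 0.1)]
combined = multistream_log_likelihood(log_b_lip, log_b_hand, lip_weight=0.4)
best_state = max(range(len(combined)), key=combined.__getitem__)
```

In this toy case the lip stream alone favors the first state, but once the hand stream is weighted in, the second state scores highest, which shows how the weight shifts the decision between modalities.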
Main file
IEEETASL_CUEDSPEECH.pdf (459.79 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00346166, version 1 (11-12-2008)
hal-00346166, version 2 (09-02-2009)

Identifiers

  • HAL Id: hal-00346166, version 1

Cite

Panikos Heracleous, Noureddine Aboutabit, Denis Beautemps. Automatic recognition of French Cued Speech using multimodal fusion based on hidden Markov models. IEEE Transactions on Audio, Speech and Language Processing, 2009, pp.1-8. ⟨hal-00346166v1⟩
