Automatic recognition of French Cued Speech using multimodal fusion based on hidden Markov models
Abstract
In this article, automatic recognition of French Cued Speech based on hidden Markov models (HMMs) is presented. Cued Speech is a visual communication system that uses handshapes placed in different positions, in combination with the lip patterns of speech, to make all the sounds of spoken language clearly understandable to deaf and hearing-impaired people. The aim of Cued Speech is to overcome the limitations of lipreading and thus enable deaf children and adults to understand full spoken language. Automatic recognition of Cued Speech requires both lip shape and gesture recognition; moreover, the integration of the two modalities is of the greatest importance. In this study, the lip shape component is fused with the gesture components to realize Cued Speech recognition. Using concatenative feature fusion and multi-stream HMM decision fusion, vowel and consonant recognition experiments were conducted. For vowel recognition, an accuracy of 87.6% was obtained, a 61.3% relative improvement over the sole use of lip shape parameters. For consonant recognition, an accuracy of 78.9% was obtained, a 56% relative improvement over the use of lip shape only. In addition to vowel and consonant recognition, a complete phoneme recognition experiment using concatenated feature vectors and Gaussian mixture model (GMM) discrimination yielded a 74.4% phoneme accuracy. These results are comparable to the accuracies obtained using the audio signal, demonstrating the effectiveness of the proposed approaches for Cued Speech recognition.
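The two fusion strategies named in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the feature values, class scores, stream weights, and function names below are all hypothetical assumptions.

```python
# Sketch of the two fusion schemes under illustrative assumptions.
# All values, weights, and names below are hypothetical, not from the paper.

def concatenative_fusion(lip_features, hand_features):
    """Feature-level fusion: stack the two modality vectors into a single
    observation vector, which one HMM would then model jointly."""
    return lip_features + hand_features

def multistream_decision(lip_logliks, hand_logliks, w_lip=0.4, w_hand=0.6):
    """Decision-level fusion: weight the per-class log-likelihoods of
    independently trained lip and hand HMM streams, then pick the best class."""
    combined = [w_lip * l + w_hand * h
                for l, h in zip(lip_logliks, hand_logliks)]
    return combined.index(max(combined))

# Example: three candidate vowel classes scored by each stream.
lip_ll = [-42.0, -40.5, -45.2]
hand_ll = [-30.1, -33.7, -29.8]
best = multistream_decision(lip_ll, hand_ll)  # -> 0 (first class wins here)
```

In the multi-stream scheme, the weights control how much each modality contributes to the final decision; tuning them per task (e.g. favoring the hand stream for cues that are ambiguous on the lips) is the usual motivation for decision-level over feature-level fusion.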
Origin: Files produced by the author(s)