Hand shape Coding for HMM-based Consonant Recognition in Cued Speech for French
Abstract
Cued Speech (CS) is a visual communication mode that uses hand shapes placed at different positions near the face, in combination with natural lipreading, to enhance speech perception from visual input. The system relies on the speaker's hand moving in close relation with speech. In CS, hand shapes distinguish among consonants and hand placements distinguish among vowels. As a result, both the manual flow and the lip flow produced by the CS speaker carry part of the phonetic information. This contribution presents automatic hand shape coding of a CS video recording, which reached 92% accuracy, and multistream hidden Markov model (HMM) fusion, which integrates hand shape and lip shape elements into a combined component to perform automatic recognition of CS for French. Compared with using the lip shape modality alone, fusion raised CS consonant recognition accuracy from 52.1% to 79.6%.
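As an illustration of the fusion idea, the sketch below scores one HMM state by combining per-stream log-likelihoods with stream weights, a common formulation of multistream HMMs. It is not the authors' implementation: the function names, the single-Gaussian emission model, the one-dimensional features, the weights, and the toy parameters for the two consonant labels are all assumptions made for this example.

```python
import math

def gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def multistream_log_likelihood(lip_obs, hand_obs, state, lip_weight=0.5):
    """Weighted sum of lip and hand log-likelihoods for one HMM state.

    lip_weight is a hypothetical stream weight in [0, 1]; the hand
    stream receives the remaining weight.
    """
    hand_weight = 1.0 - lip_weight
    lip_ll = gaussian_logpdf(lip_obs, state["lip_mean"], state["lip_var"])
    hand_ll = gaussian_logpdf(hand_obs, state["hand_mean"], state["hand_var"])
    return lip_weight * lip_ll + hand_weight * hand_ll

# Toy states (assumed values): two consonants that look the same on the
# lips but are cued with different hand shapes.
STATES = {
    "p": {"lip_mean": [0.0], "lip_var": [1.0], "hand_mean": [1.0], "hand_var": [1.0]},
    "b": {"lip_mean": [0.0], "lip_var": [1.0], "hand_mean": [4.0], "hand_var": [1.0]},
}

def classify(lip_obs, hand_obs, lip_weight=0.5):
    """Return the label of the state with the highest fused score."""
    return max(
        STATES,
        key=lambda s: multistream_log_likelihood(
            lip_obs, hand_obs, STATES[s], lip_weight
        ),
    )
```

With `lip_weight=1.0` the two toy states receive identical scores, because their lip parameters are the same. Giving the hand stream a nonzero weight lets the hand shape feature separate them, which is the intuition behind the gain that the abstract reports for fusion over the lip modality alone.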
Origin: Files produced by the author(s)