Conference paper. Year: 1997

Audio-Visual speech recognition and segmental master-slave HMM

Abstract

Our work deals with the classical problem of merging heterogeneous and asynchronous parameters. It is well known that lip reading improves speech recognition scores, especially in noisy conditions, so we study more precisely the modeling of acoustic and labial parameters and propose two automatic speech recognition systems:

  • Direct identification is performed with a classical HMM approach: no correlation between visual and acoustic parameters is assumed.
  • Two correlated models, a master HMM and a slave HMM, process the labial observations and the acoustic observations respectively.

To assess each approach, we use a segmental pre-processing and an acoustically robust elementary unit, the "pseudodiphone". Our task is the recognition of spelled French letters in clean and noisy (cocktail party) environments. Whatever the approach and condition, the introduction of labial features improves performance, but the difference between the two models is not large enough to favor either one.
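To make the master-slave idea concrete, here is a minimal sketch of how a joint likelihood could be computed when the slave HMM's parameters depend on the current state of the master HMM. It is not the authors' implementation. It assumes discrete observation symbols and frame-synchronous streams, and it ignores the segmental pre-processing and the pseudodiphone units mentioned in the abstract. All names (`master_slave_likelihood`, `PI_M`, `A_S`, and so on) and all toy probabilities are hypothetical.

```python
def master_slave_likelihood(pi_m, a_m, b_m, pi_s, a_s, b_s, visual, acoustic):
    """Joint likelihood P(visual, acoustic) via a forward pass over
    (master state, slave state) pairs.

    Master HMM (labial stream):
      pi_m[i], a_m[i][j], b_m[i][v]
    Slave HMM (acoustic stream), every table indexed first by the
    master state i it is conditioned on:
      pi_s[i][k], a_s[i][k][l], b_s[i][l][a]
    Assumption: both streams have the same length (one symbol per frame).
    """
    n_m, n_s = len(pi_m), len(pi_s[0])

    # Initialisation: alpha[i][k] = P(first symbols, master=i, slave=k)
    alpha = [[pi_m[i] * b_m[i][visual[0]] * pi_s[i][k] * b_s[i][k][acoustic[0]]
              for k in range(n_s)]
             for i in range(n_m)]

    # Recursion: the master moves i -> j, then the slave moves k -> l
    # using the transition table selected by the new master state j.
    for v, a in zip(visual[1:], acoustic[1:]):
        new_alpha = [[0.0] * n_s for _ in range(n_m)]
        for j in range(n_m):
            for l in range(n_s):
                total = 0.0
                for i in range(n_m):
                    for k in range(n_s):
                        total += alpha[i][k] * a_m[i][j] * a_s[j][k][l]
                new_alpha[j][l] = total * b_m[j][v] * b_s[j][l][a]
        alpha = new_alpha

    # Termination: sum over all final state pairs.
    return sum(sum(row) for row in alpha)


# Toy parameters (hypothetical): 2 master states, 2 slave states,
# 2 visual symbols, 2 acoustic symbols.
PI_M = [0.6, 0.4]
A_M = [[0.7, 0.3], [0.2, 0.8]]
B_M = [[0.9, 0.1], [0.3, 0.7]]
PI_S = [[0.5, 0.5], [0.8, 0.2]]
A_S = [[[0.6, 0.4], [0.1, 0.9]],
       [[0.3, 0.7], [0.5, 0.5]]]
B_S = [[[0.8, 0.2], [0.4, 0.6]],
       [[0.25, 0.75], [0.5, 0.5]]]

likelihood = master_slave_likelihood(
    PI_M, A_M, B_M, PI_S, A_S, B_S, visual=[0, 1, 1], acoustic=[1, 0, 0])
```

In this sketch the master stream drives the model and the slave's transition and emission tables are selected by the master state, which is one way to express a correlation between the labial and acoustic streams. If every slave table were identical across master states, the two streams would become independent and the likelihood would factor into the product of two ordinary HMM likelihoods.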
Main file
av97_049.pdf (280.38 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-01437201, version 1 (23-03-2022)

Identifiers

  • HAL Id: hal-01437201, version 1

Cite

Régine André-Obrecht, Bruno Jacob, Nathalie Parlangeau. Audio-Visual speech recognition and segmental master-slave HMM. ESCA Workshop on Audio-Visual Speech Processing (AVSP 1997), Sep 1997, Rhodes, Greece. pp.49-52. ⟨hal-01437201⟩