Audio-Visual speech recognition and segmental master-slave HMM

Abstract : Our work deals with the classical problem of merging heterogeneous and asynchronous parameters. It is well known that lip reading improves speech recognition scores, especially in noisy conditions, so we study more precisely the modeling of acoustic and labial parameters and propose two automatic speech recognition systems. In the first, a direct identification is performed using a classical HMM approach: no correlation between visual and acoustic parameters is assumed. In the second, two correlated models, a master HMM and a slave HMM, process the labial observations and the acoustic observations respectively. To assess each approach, we use a segmental pre-processing and an acoustically robust elementary unit, the "pseudodiphone". Our task is the recognition of spelled French letters in clean and noisy (cocktail party) environments. Whatever the approach and the condition, introducing labial features improves performance, but the difference between the two models is not large enough to favor either one.
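
As an illustration of the master-slave idea mentioned in the abstract, the sketch below computes a joint likelihood in which the acoustic (slave) model's parameters are selected by the current state of the labial (master) model. It is not the authors' implementation. It assumes discrete observation symbols, frame-synchronous streams, and no segmental pre-processing, and every name and number in it (`master_slave_likelihood`, the toy probability tables) is hypothetical.

```python
from itertools import product


def forward_likelihood(init, trans, emit, obs):
    """Standard forward algorithm for a single discrete-observation HMM."""
    n = len(init)
    alpha = [init[i] * emit[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
            for j in range(n)
        ]
    return sum(alpha)


def master_slave_likelihood(master, slave, visual_obs, acoustic_obs):
    """Forward pass over (master state, slave state) pairs.

    master: 'init', 'trans', 'emit' tables for the visual stream.
    slave:  'init', 'trans', 'emit' tables for the acoustic stream, each
            indexed first by the master state (assumed conditioning scheme).
    """
    n_m = len(master["init"])
    n_s = len(slave["init"][0])
    pairs = list(product(range(n_m), range(n_s)))
    v0, a0 = visual_obs[0], acoustic_obs[0]
    alpha = {
        (m, s): master["init"][m] * master["emit"][m][v0]
        * slave["init"][m][s] * slave["emit"][m][s][a0]
        for m, s in pairs
    }
    for v, a in zip(visual_obs[1:], acoustic_obs[1:]):
        alpha = {
            (m2, s2): sum(
                alpha[(m1, s1)]
                * master["trans"][m1][m2]
                * slave["trans"][m2][s1][s2]  # slave transitions chosen by master state
                for m1, s1 in pairs
            )
            * master["emit"][m2][v]
            * slave["emit"][m2][s2][a]  # slave emissions chosen by master state
            for m2, s2 in pairs
        }
    return sum(alpha.values())


# Toy, made-up parameters: 2 master states, 2 slave states, binary symbols.
master = {
    "init": [0.6, 0.4],
    "trans": [[0.7, 0.3], [0.2, 0.8]],
    "emit": [[0.9, 0.1], [0.3, 0.7]],
}
slave = {
    "init": [[0.5, 0.5], [0.8, 0.2]],
    "trans": [[[0.6, 0.4], [0.1, 0.9]], [[0.9, 0.1], [0.5, 0.5]]],
    "emit": [[[0.8, 0.2], [0.4, 0.6]], [[0.3, 0.7], [0.5, 0.5]]],
}

likelihood = master_slave_likelihood(master, slave, [0, 1, 1], [1, 0, 1])
print(likelihood)
```

If the slave tables are identical for every master state, this joint likelihood reduces to the product of two independent HMM likelihoods, which gives a simple check on the sketch.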
Document type :
Conference papers
Contributor : HAKIM AMOKRANE
Submitted on : Wednesday, March 23, 2022 - 3:49:54 PM
Last modification on : Wednesday, June 1, 2022 - 4:39:32 AM
Long-term archiving on : Friday, June 24, 2022 - 7:43:07 PM


Publisher files allowed on an open archive


  • HAL Id : hal-01437201, version 1


Régine André-Obrecht, Bruno Jacob, Nathalie Parlangeau. Audio-Visual speech recognition and segmental master-slave HMM. ESCA Workshop on Audio-Visual Speech Processing (AVSP 1997), Sep 1997, Rhodes, Greece. pp.49-52. ⟨hal-01437201⟩


