Audio-Visual speech recognition and segmental master-slave HMM

Régine André-Obrecht; Bruno Jacob; Nathalie Parlangeau

Conference paper, Year: 1997

Régine André-Obrecht

Role: Author
PersonId : 740810
IdHAL : obrecht
IdRef : 060375965

Institut de recherche en informatique de Toulouse

Bruno Jacob

Role: Author
PersonId : 1000123

Institut de recherche en informatique de Toulouse

Nathalie Parlangeau

Role: Author
PersonId : 752506
IdHAL : nathalie-valles-parlangeau
ORCID : 0000-0002-4463-5177
IdRef : 129047805

Institut de recherche en informatique de Toulouse

Abstract

Our work addresses the classical problem of merging heterogeneous and asynchronous parameters. It is well known that lip reading improves speech recognition scores, especially in noisy conditions, so we study the modeling of acoustic and labial parameters more closely and propose two automatic speech recognition systems. In the first, a Direct Identification is performed using a classical HMM approach: no correlation between visual and acoustic parameters is assumed. In the second, two correlated models, a master HMM and a slave HMM, process the labial observations and the acoustic observations, respectively. To assess each approach, we use a segmental pre-processing and a robust acoustic elementary unit, the "pseudodiphone". Our task is the recognition of spelled French letters in clean and noisy (cocktail party) environments. Whatever the approach and condition, the introduction of labial features improves performance, but the difference between the two models is not large enough to favor either one.
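
As a rough illustration of the master-slave idea, the sketch below computes the joint likelihood of a labial and an acoustic observation sequence with a forward recursion over (master state, slave state) pairs. It is not taken from the paper: it assumes discrete observation symbols, two streams already aligned to the same length, and a slave HMM whose initial, transition, and emission probabilities are indexed by the current master state. All function and parameter names (`forward`, `forward_master_slave`, `pi_m`, `A_s`, and so on) are hypothetical.

```python
from itertools import product


def forward(pi, A, B, obs):
    """Likelihood of a discrete observation sequence under a plain HMM."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [B[j][o] * sum(alpha[i] * A[i][j] for i in range(n))
                 for j in range(n)]
    return sum(alpha)


def forward_master_slave(pi_m, A_m, B_m, pi_s, A_s, B_s, lab_obs, ac_obs):
    """Joint likelihood P(lab_obs, ac_obs) for a master-slave HMM pair.

    Master (labial stream): pi_m[i], A_m[i][j], B_m[i][v].
    Slave (acoustic stream), conditioned on the master state j (assumption):
    pi_s[j][k], A_s[j][k][l], B_s[j][l][a].
    """
    if not lab_obs or len(lab_obs) != len(ac_obs):
        raise ValueError("streams must be non-empty and of equal length")
    n_master, n_slave = len(pi_m), len(pi_s[0])
    pairs = list(product(range(n_master), range(n_slave)))

    # Initialisation: the master emits a labial symbol, the slave an acoustic one.
    alpha = {
        (i, k): pi_m[i] * B_m[i][lab_obs[0]] * pi_s[i][k] * B_s[i][k][ac_obs[0]]
        for i, k in pairs
    }
    # Recursion: the slave transition and emission use the new master state j.
    for v, a in zip(lab_obs[1:], ac_obs[1:]):
        alpha = {
            (j, l): B_m[j][v] * B_s[j][l][a] * sum(
                alpha[i, k] * A_m[i][j] * A_s[j][k][l] for i, k in pairs
            )
            for j, l in pairs
        }
    return sum(alpha.values())
```

When the slave parameters are identical for every master state, the joint likelihood reduces to the product of two independent HMM likelihoods, which gives a simple sanity check for the recursion.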

Domains

Computation and Language [cs.CL]

Main file

av97_049.pdf (280.38 KB)

Origin: Publisher files allowed on an open archive

Contributor: Hakim Amokrane

https://hal.science/hal-01437201

Submitted on: Wednesday, March 23, 2022, 15:49:54

Last modified on: Monday, November 20, 2023, 11:44:20

Long-term archiving on: Friday, June 24, 2022, 19:43:07

Dates and versions

hal-01437201, version 1 (23-03-2022)

Identifiers

HAL Id: hal-01437201, version 1

Cite

Régine André-Obrecht, Bruno Jacob, Nathalie Parlangeau. Audio-Visual speech recognition and segmental master-slave HMM. ESCA Workshop on Audio-Visual Speech Processing (AVSP 1997), Sep 1997, Rhodes, Greece. pp.49-52. ⟨hal-01437201⟩

Collections

UNIV-TLSE2 CNRS UT1-CAPITOLE LIUM IRIT TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP

87 views

13 downloads
