Conference paper. Year: 2009

Area of mouth opening estimation from speech acoustics using blind deconvolution technique

Abstract

We propose a new method for estimating the area of mouth opening in a video sequence of a speaking person. In a paper published in 2000, Grant and Seitz reported varying degrees of correlation between acoustic envelopes and visible movements. Our method exploits these correlations to establish a mathematical model of a Single-Input Multiple-Output (SIMO) system in which the area of mouth opening is the unknown single input to be estimated. The subband Root Mean Squared (RMS) energies of the speech signal are the observable multiple outputs of the model. The unknown input signal can be estimated directly using existing blind deconvolution techniques. The method requires only an audio sequence to estimate the area of mouth opening in the corresponding video sequence. Consequently, it avoids both the complex image processing techniques of conventional visual feature extraction methods and the estimator training required by audio-to-visual mapping methods. The audio-visual sequences used for the estimation tests were recorded with an ordinary webcam. The estimation results are promising: the estimated area of mouth opening is sufficiently correlated with the manually measured one, and the average correlation coefficient obtained with the most effective configuration of the proposed method, on a set of 16 French sentences, is 0.73.
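As a rough illustration of the quantities named in the abstract, the sketch below computes per-frame subband RMS energies (the observable outputs of the SIMO model) from a signal and scores them against a reference mouth-area track with the Pearson correlation coefficient. It does not implement the blind deconvolution step, and nothing in it comes from the paper itself: the function name `subband_rms`, the 40 ms frame length, the three frequency bands, and the synthetic amplitude-modulated noise standing in for speech are all assumptions made for the example.

```python
import numpy as np

def subband_rms(signal, fs, bands, frame_len=0.04):
    """Per-frame RMS spectral magnitude in each band; shape (n_bands, n_frames).

    Hypothetical helper: frames are non-overlapping and Hann-windowed, and
    the band edges (in Hz) are illustrative, not the paper's configuration.
    """
    n = int(round(frame_len * fs))            # samples per frame
    n_frames = len(signal) // n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)    # bin centre frequencies in Hz
    window = np.hanning(n)
    out = np.zeros((len(bands), n_frames))
    for t in range(n_frames):
        frame = signal[t * n:(t + 1) * n] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        for b, (lo, hi) in enumerate(bands):
            mask = (freqs >= lo) & (freqs < hi)
            out[b, t] = np.sqrt(power[mask].mean())
    return out

# Synthetic stand-in for speech: white noise whose amplitude follows a slow
# 2 Hz "mouth area" signal (an assumption, used only to exercise the code).
fs = 8000
t = np.arange(2 * fs) / fs
area = 0.5 * (1.0 + np.sin(2 * np.pi * 2.0 * t))
rng = np.random.default_rng(0)
speech_like = area * rng.standard_normal(t.size)

bands = [(100, 1000), (1000, 2000), (2000, 3500)]
env = subband_rms(speech_like, fs, bands)

# Average the reference area over the same frames, then correlate it with
# each subband envelope.
n = int(round(0.04 * fs))
frame_area = area[:env.shape[1] * n].reshape(env.shape[1], n).mean(axis=1)
corrs = [np.corrcoef(frame_area, e)[0, 1] for e in env]
```

In the method described above, the mouth area would not be compared with each envelope separately as done here. It would instead be recovered as the common input of all subband envelopes by a blind deconvolution algorithm, and only that estimate would be correlated with the manually measured area.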
File not deposited

Dates and versions

hal-01770963, version 1 (19-04-2018)

Identifiers

  • HAL Id: hal-01770963, version 1

Cite

Cong Thanh Do, Abdeldjalil Aissa El Bey, Dominique Pastor, André Goalic. Area of mouth opening estimation from speech acoustics using blind deconvolution technique. AVSP 2009: 8th International Conference on Auditory-Visual Speech Processing, Sep 2009, Norwich, United Kingdom. pp. 80-85. ⟨hal-01770963⟩