Mapping between Acoustic and Articulatory Gestures - HAL open archive
Journal article, Speech Communication, Year: 2011

Mapping between Acoustic and Articulatory Gestures

Abstract

This paper proposes a definition for articulatory as well as acoustic gestures, along with a method to segment the measured articulatory trajectories and the acoustic waveform into gestures. Using a simultaneously recorded acoustic-articulatory database, the gestures are detected by finding critical points in the utterance in both the acoustic and the articulatory representations. The acoustic gestures are parameterized using 2-D cepstral coefficients. The articulatory trajectories are essentially the horizontal and vertical movements of Electromagnetic Articulography (EMA) coils placed on the tongue, jaw and lips along the midsagittal plane. The articulatory movements are parameterized with a 2-D DCT, the same transformation that is applied to the acoustics. The relationship between the detected acoustic and articulatory gestures is studied in terms of both timing and shape. Acoustic-to-articulatory inversion is also performed using GMM-based regression, in order to study this relationship further. The accuracy of predicting the articulatory trajectories from the acoustic waveform is on par with state-of-the-art frame-based methods with dynamical constraints (with an average error of 1.45-1.55 mm for the two speakers in the database). In order to evaluate the acoustic-to-articulatory inversion in a more intuitive manner, a method based on the error in the estimated critical points is suggested. Using this method, it was noted that the articulatory trajectories estimated by the acoustic-to-articulatory inversion methods were still not accurate enough to fall within the perceptual tolerance of audio-visual asynchrony.
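The 2-D DCT parameterization mentioned in the abstract can be illustrated with a small sketch. It assumes a gesture segment stored as a channels-by-frames matrix and an orthonormal DCT-II; the function names, the number of retained coefficients and the toy trajectory are illustrative assumptions and are not taken from the paper.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis; row k is the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def dct2(segment):
    """2-D DCT of a (channels x frames) segment: C_rows @ X @ C_cols^T."""
    c_rows = dct_matrix(len(segment))
    c_cols = dct_matrix(len(segment[0]))
    return matmul(matmul(c_rows, segment), transpose(c_cols))

def idct2(coeffs):
    """Inverse of dct2; valid because both bases are orthonormal."""
    c_rows = dct_matrix(len(coeffs))
    c_cols = dct_matrix(len(coeffs[0]))
    return matmul(matmul(transpose(c_rows), coeffs), c_cols)

def truncate(coeffs, keep_rows, keep_cols):
    """Zero everything outside the low-order keep_rows x keep_cols block."""
    return [[v if (r < keep_rows and c < keep_cols) else 0.0
             for c, v in enumerate(row)]
            for r, row in enumerate(coeffs)]

def rmse(a, b):
    """Root-mean-square error over all entries of two equal-shape matrices."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum(diffs) / len(diffs))

# Toy segment (assumed data): x and y positions of one coil over 8 frames.
frames = 8
segment = [
    [0.5 * t for t in range(frames)],
    [math.sin(math.pi * t / (frames - 1)) for t in range(frames)],
]

coeffs = dct2(segment)
compact = truncate(coeffs, 2, 4)   # hypothetical choice: keep 2 x 4 coefficients
approx = idct2(compact)
print("reconstruction RMSE:", rmse(segment, approx))
```

Keeping only the low-order block gives every segment a fixed-length description regardless of its duration in frames, which is one reason such a transform is convenient for comparing gestures of different lengths.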
Main file
PEER_stage2_10.1016%2Fj.specom.2011.01.009.pdf (3.4 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-00727161, version 1 (03-09-2012)

Cite

G. Ananthakrishnan, Olov Engwall. Mapping between Acoustic and Articulatory Gestures. Speech Communication, 2011, 53 (4), pp.567. ⟨10.1016/j.specom.2011.01.009⟩. ⟨hal-00727161⟩

Collections

PEER