Conference paper, year: 2010

Acoustic-to-articulatory inversion in speech based on statistical models

Abstract

Two speech inversion methods are implemented and compared. In the first, multistream Hidden Markov Models (HMMs) of phonemes are jointly trained on synchronous streams of articulatory data, acquired by electromagnetic articulography (EMA), and speech spectral parameters. An acoustic recognition system uses the acoustic part of the HMMs to deliver a phoneme chain and the state durations. A trajectory formation procedure based on the articulatory part of the HMMs then uses this information to resynthesise the articulatory movements. In the second, Gaussian Mixture Models (GMMs) are trained on the same streams to directly associate articulatory frames with acoustic frames in context, using Maximum Likelihood Estimation. Over a corpus of 17 minutes uttered by a French speaker, the RMS error was 1.62 mm with the HMMs and 2.25 mm with the GMMs.
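To make the GMM idea concrete, the sketch below maps an acoustic frame to an articulatory frame with a joint GMM and scores the result with an RMS error. It is an illustration under stated assumptions, not the authors' implementation: it uses the conditional-mean (minimum mean-square error) mapping rather than the Maximum Likelihood Estimation mapping named in the abstract, it takes the GMM parameters as given instead of training them, and the function names (gmm_conditional_mean, rms_error) and toy numbers are hypothetical. NumPy is assumed to be available.

```python
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate Gaussian at a single point x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def gmm_conditional_mean(x, weights, means, covs):
    """Estimate articulatory frame y from acoustic frame x as E[y | x].

    Assumption: each component models the joint vector [x, y], with the
    acoustic dimensions first. This is the MMSE mapping, swapped in here
    for simplicity; it is not the MLE-based mapping of the paper.
    """
    dx = x.shape[0]
    log_post, cond_means = [], []
    for w, mu, sigma in zip(weights, means, covs):
        mu_x, mu_y = mu[:dx], mu[dx:]
        s_xx = sigma[:dx, :dx]   # acoustic covariance block
        s_yx = sigma[dx:, :dx]   # articulatory-acoustic cross-covariance
        # Unnormalised log posterior of this component given x.
        log_post.append(np.log(w) + gaussian_logpdf(x, mu_x, s_xx))
        # Per-component linear regression of y on x.
        cond_means.append(mu_y + s_yx @ np.linalg.solve(s_xx, x - mu_x))
    log_post = np.array(log_post)
    post = np.exp(log_post - log_post.max())  # stable normalisation
    post /= post.sum()
    return post @ np.array(cond_means)


def rms_error(pred, ref):
    """Root-mean-square error over all frames and coordinates."""
    pred, ref = np.asarray(pred, float), np.asarray(ref, float)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))


# Toy example: 1-D acoustic feature, 1-D articulatory coordinate.
weights = np.array([0.5, 0.5])
means = np.array([[-5.0, -1.0], [5.0, 1.0]])
covs = np.array([np.eye(2), np.eye(2)])
y_hat = gmm_conditional_mean(np.array([5.0]), weights, means, covs)
```

In a real system the acoustic vector would be a multidimensional spectral frame with context and the articulatory vector would hold the EMA coil coordinates. How the reported error is pooled across coils and frames is not stated in the abstract, so the pooling in rms_error is an assumption.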

Dates and versions

hal-00508279, version 1 (02-08-2010)

Identifiers

  • HAL Id: hal-00508279, version 1

Cite

Atef Ben Youssef, Pierre Badin, Gérard Bailly. Acoustic-to-articulatory inversion in speech based on statistical models. AVSP 2010 - 9th International Conference on Auditory-Visual Speech Processing, Sep 2010, Hakone, Kanagawa, Japan. pp.S8-3. ⟨hal-00508279⟩