Conference paper, Year: 2016

Copy synthesis of running speech based on vocal tract imaging and audio recording

Abstract

This study presents a simulation framework to synthesize running speech from information obtained from simultaneous vocal tract imaging and audio recording. The aim is to numerically simulate the acoustic and mechanical phenomena that occur during speech production given the actual articulatory gestures of the speaker, so that the simulated speech reproduces the original acoustic features (formant trajectories, prosody, segmental phonation, etc.). The result is intended to be a copy of the original speech signal, hence the name copy synthesis. The shape of the vocal tract is extracted from 2D midsagittal views of the vocal tract acquired at a frame rate sufficient to capture a few images per produced phone. The resulting area functions of the vocal tract are therefore anatomically realistic, and they also account for side cavities. The acoustic simulation framework uses an extended version of the single-matrix formulation that enables a self-oscillating model of the vocal folds with a glottal chink to be connected to the time-varying waveguide network that models the vocal tract. Copy synthesis of a few French sentences demonstrates the accuracy of the simulation framework in reproducing the acoustic cues of natural phrase-level utterances that contain most French natural classes, while taking into account the real geometric shape of the speaker's vocal tract. The framework is intended to be used as a tool to relate the acoustic features of speech to their articulatory or phonatory origins.
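To make the link between an area function and formant frequencies concrete, here is a minimal Python sketch. It is not the extended single-matrix, time-domain formulation described in the abstract. It swaps in the classical frequency-domain chain-matrix model of concatenated lossless tubes, and it assumes a closed glottis, an ideal open end at the lips (zero pressure), no side cavities, and no glottal source model. The function names, the constants `C` and `RHO`, and the uniform 17.5 cm tube are all illustrative assumptions, not values from the paper.

```python
import numpy as np

C = 350.0    # assumed speed of sound in warm, moist air (m/s)
RHO = 1.14   # assumed air density (kg/m^3)

def chain_denominator(areas, lengths, freqs):
    """Return D(f) such that U_lips / U_glottis = 1 / D(f).

    areas   : cross-sectional areas from glottis to lips (m^2)
    lengths : section lengths from glottis to lips (m)
    freqs   : NumPy array of frequencies (Hz)
    Lossless tubes with zero pressure at the lips are assumed, so D is real.
    """
    d = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        k = 2.0 * np.pi * f / C              # wavenumber
        m = np.eye(2, dtype=complex)         # accumulated chain matrix
        for a, l in zip(areas, lengths):
            z = RHO * C / a                  # characteristic impedance
            t = np.array([[np.cos(k * l), 1j * z * np.sin(k * l)],
                          [1j * np.sin(k * l) / z, np.cos(k * l)]])
            m = m @ t
        # [P_glottis, U_glottis] = m @ [P_lips, U_lips] with P_lips = 0,
        # hence U_glottis = m[1, 1] * U_lips.
        d[i] = m[1, 1].real
    return d

def estimate_formants(areas, lengths, freqs):
    """Estimate resonances as the zero crossings of D(f)."""
    d = chain_denominator(areas, lengths, freqs)
    positive = d > 0
    idx = np.nonzero(positive[:-1] != positive[1:])[0]
    f0, f1, d0, d1 = freqs[idx], freqs[idx + 1], d[idx], d[idx + 1]
    return f0 - d0 * (f1 - f0) / (d1 - d0)   # linear interpolation

# Hypothetical uniform tube: 35 sections, 17.5 cm long, 3 cm^2 area.
n_sections = 35
areas = np.full(n_sections, 3.0e-4)
lengths = np.full(n_sections, 0.175 / n_sections)
freqs = np.arange(50.0, 5000.0, 5.0)
print(np.round(estimate_formants(areas, lengths, freqs)))
```

For this uniform tube the estimates should land close to the quarter-wavelength resonances, roughly 500, 1500, 2500, 3500 and 4500 Hz. Applying the same function to one area function per image frame would give a rough formant trajectory for comparison with the recorded audio, under the simplifying assumptions listed above.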
Main file
ICA2016-0699.pdf (3.98 MB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-01372310, version 1 (27-09-2016)

Identifiers

  • HAL Id: hal-01372310, version 1

Cite

Benjamin Elie, Yves Laprie. Copy synthesis of running speech based on vocal tract imaging and audio recording. 22nd International Congress on Acoustics (ICA), Sep 2016, Buenos Aires, Argentina. ⟨hal-01372310⟩