Copy synthesis of running speech based on vocal tract imaging and audio recording

Benjamin Elie 1,*, Yves Laprie 1
* Corresponding author
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : This study presents a simulation framework for synthesizing running speech from simultaneous vocal tract imaging and audio recording. The aim is to numerically simulate the acoustic and mechanical phenomena that occur during speech production, given the actual articulatory gestures of the speaker, so that the simulated speech reproduces the original acoustic features (formant trajectories, prosody, segmental phonation, etc.). The result is intended to be a copy of the original speech signal, hence the name copy synthesis. The shape of the vocal tract is extracted from 2D midsagittal views acquired at a frame rate high enough to capture a few images per produced phone. The resulting area functions of the vocal tract are therefore anatomically realistic and also account for side cavities. The acoustic simulation framework uses an extended version of the single-matrix formulation, which allows a self-oscillating model of the vocal folds with a glottal chink to be connected to the time-varying waveguide network that models the vocal tract. Copy synthesis of a few French sentences shows that the framework accurately reproduces the acoustic cues of natural phrase-level utterances containing most French natural classes, while taking into account the real geometric shape of the speaker's vocal tract. The framework is intended as a tool for relating the acoustic features of speech to their articulatory or phonatory origins.
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01372310
Contributor : Benjamin Elie
Submitted on : Tuesday, September 27, 2016 - 10:21:45 AM
Last modification on : Tuesday, December 18, 2018 - 4:38:02 PM
Document(s) archived on : Wednesday, December 28, 2016 - 1:05:58 PM

File

ICA2016-0699.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-01372310, version 1

Citation

Benjamin Elie, Yves Laprie. Copy synthesis of running speech based on vocal tract imaging and audio recording. 22nd International Congress on Acoustics (ICA), Sep 2016, Buenos Aires, Argentina. Proceedings of the 22nd International Congress on Acoustics, 2016. 〈hal-01372310〉
