Copy synthesis of running speech based on vocal tract imaging and audio recording

Benjamin Elie 1,*, Yves Laprie 1
* Corresponding author
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: This study presents a simulation framework to synthesize running speech from information obtained from simultaneous vocal tract imaging and audio recording. The aim is to numerically simulate the acoustic and mechanical phenomena that occur during speech production given the actual articulatory gestures of the speaker, so that the simulated speech reproduces the original acoustic features (formant trajectories, prosody, segmental phonation, etc.). The result is intended to be a copy of the original speech signal, hence the name copy synthesis. The shape of the vocal tract is extracted from 2D midsagittal views of the vocal tract acquired at a frame rate sufficient to capture a few images per produced phone. The resulting area functions of the vocal tract are therefore anatomically realistic, and also account for side cavities. The acoustic simulation framework uses an extended version of the single-matrix formulation that enables a self-oscillating model of the vocal folds with a glottal chink to be connected to the time-varying waveguide network that models the vocal tract. Copy synthesis of a few French sentences shows that the simulation framework accurately reproduces the acoustic cues of natural phrase-level utterances containing most French natural classes while accounting for the real geometric shape of the speaker's vocal tract. The framework is intended to be used as a tool to relate the acoustic features of speech to their articulatory or phonatory origins.
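As a rough illustration of how an area function determines formant frequencies, the sketch below computes the resonances of a static, lossless concatenation of cylindrical tubes using chain (transmission) matrices. This is not the extended single-matrix formulation described in the abstract: it ignores side cavities, losses, time variation and the self-oscillating vocal folds, and assumes a closed glottis and an ideal open end at the lips. The function names, the air density and the speed of sound are assumptions made for the example.

```python
import numpy as np

RHO = 1.2   # air density in kg/m^3 (assumed value)
C = 350.0   # speed of sound in m/s (assumed value)


def chain_matrix_d(areas, lengths, freqs):
    """Lower-right entry D(f) of the glottis-to-lips chain matrix.

    Each lossless section has the chain matrix
        [[cos(kl), j*Z*sin(kl)], [j*sin(kl)/Z, cos(kl)]],  Z = RHO*C/area.
    Products of such matrices keep the form [[k11, j*k12], [j*k21, k22]]
    with real k11..k22, so only real arithmetic is needed.
    """
    k = 2.0 * np.pi * freqs / C
    k11 = np.ones_like(freqs)
    k12 = np.zeros_like(freqs)
    k21 = np.zeros_like(freqs)
    k22 = np.ones_like(freqs)
    for area, length in zip(areas, lengths):
        z = RHO * C / area
        cs, sn = np.cos(k * length), np.sin(k * length)
        # Right-multiply the running product by this section's matrix.
        k11, k12, k21, k22 = (
            k11 * cs - k12 * sn / z,
            k11 * z * sn + k12 * cs,
            k21 * cs + k22 * sn / z,
            k22 * cs - k21 * z * sn,
        )
    return k22


def formants(areas, lengths, fmax=4000.0, step=1.0):
    """Resonance frequencies (Hz) below fmax for the given area function.

    With zero pressure at the lips, the volume-velocity transfer function
    is 1/D(f), so resonances are the zero crossings of D(f). They are
    located by a sign change on a frequency grid, then refined by linear
    interpolation.
    """
    freqs = np.arange(step, fmax, step)
    d = chain_matrix_d(areas, lengths, freqs)
    idx = np.nonzero(d[:-1] * d[1:] < 0)[0]
    f0, f1 = freqs[idx], freqs[idx + 1]
    d0, d1 = d[idx], d[idx + 1]
    return f0 - d0 * (f1 - f0) / (d1 - d0)
```

For a uniform tube, for instance `formants(np.full(34, 3.0e-4), np.full(34, 0.005))` for 34 sections of 5 mm and 3 cm², the zero crossings fall at odd multiples of c/(4L), the quarter-wavelength resonances of a tube closed at one end and open at the other. A non-uniform area function shifts these resonances, which is the link between vocal tract shape and formant trajectories that the framework exploits.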
Document type:
Conference paper
22nd International Congress on Acoustics (ICA), Sep 2016, Buenos Aires, Argentina. Proceedings of the 22nd International Congress on Acoustics, 2016
Cited literature: 21 references

https://hal.archives-ouvertes.fr/hal-01372310
Contributor: Benjamin Elie
Submitted on: Tuesday, 27 September 2016 - 10:21:45
Last modified on: Tuesday, 18 December 2018 - 16:38:02
Document(s) archived on: Wednesday, 28 December 2016 - 13:05:58

File

ICA2016-0699.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-01372310, version 1

Citation

Benjamin Elie, Yves Laprie. Copy synthesis of running speech based on vocal tract imaging and audio recording. 22nd International Congress on Acoustics (ICA), Sep 2016, Buenos Aires, Argentina. Proceedings of the 22nd International Congress on Acoustics, 2016. 〈hal-01372310〉
