Articulatory Speech Synthesis from Static Context-Aware Articulatory Targets

Anastasiia Tsukanova (1), Benjamin Elie (2, 1), Yves Laprie (1)
(1) MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: The aim of this work is to develop an algorithm for controlling the articulators (the jaw, the tongue, the lips, the velum, the larynx and the epiglottis) to produce given speech sounds, syllables and phrases. This control has to take coarticulation into account and be flexible enough to vary speech production strategies. The data for the algorithm are 97 static MRI images capturing the articulation of French vowels and blocked consonant-vowel syllables. The results of this synthesis are evaluated visually, acoustically and perceptually, and the problems encountered are broken down by their origin: the dataset, its modeling, the algorithm for managing the vocal tract shapes, their translation to area functions, and the acoustic simulation. We conclude that, among our test examples, the articulatory strategies for vowels and stops are the most correct, followed by those for nasals and fricatives. Improving timing strategies with dynamic data is suggested as an avenue for future work.
Document type:
Book chapter
Qiang Fang; Jianwu Dang; Pascal Perrier; Jianguo Wei; Longbiao Wang; Nan Yan. Studies on Speech Production, Springer, pp.37-47, 2018, Lecture Notes in Computer Science, 978-3-030-00125-4. 〈10.1007/978-3-030-00126-1_4〉. 〈https://link.springer.com/chapter/10.1007%2F978-3-030-00126-1_4〉
https://hal.archives-ouvertes.fr/hal-01937950
Contributor: Anastasiia Tsukanova
Submitted on: Friday, 7 December 2018 - 12:07:39
Last modified on: Tuesday, 18 December 2018 - 16:38:02
Document(s) archived on: Friday, 8 March 2019 - 13:09:54

File

issp25-tsukanova.pdf
Files produced by the author(s)

Citation

Anastasiia Tsukanova, Benjamin Elie, Yves Laprie. Articulatory Speech Synthesis from Static Context-Aware Articulatory Targets. Qiang Fang; Jianwu Dang; Pascal Perrier; Jianguo Wei; Longbiao Wang; Nan Yan. Studies on Speech Production, Springer, pp.37-47, 2018, Lecture Notes in Computer Science, 978-3-030-00125-4. 〈10.1007/978-3-030-00126-1_4〉. 〈https://link.springer.com/chapter/10.1007%2F978-3-030-00126-1_4〉. 〈hal-01937950〉
