. Baumann-t and . Schlangen-d, Inpro_iss : A component for just-in-time incremental speech synthesis, Proceedings of the ACL 2012 System Demonstrations, 2012.

M. Cernak and . N. Motlicek-p.-&-garner-p, On the (un) importance of the contextual factors in hmm-based speech synthesis and coding, 2013.

C. M. Steiner-i, Expressive speech synthesis in MARY TTS using audiobook data and emotionML, INTERSPEECH, 2013.

. Dahmani-s, V. Colotte, and . Girard-v.-&-ouni-s, Conditional variational auto-encoder for text-driven expressive audiovisual speech synthesis, Proc. Interspeech, 2019.

F. Eyben and . Buchholz, Unsupervised clustering of emotion and voice styles for expressive TTS, ICASSP 2012, 2012.

F. B. , Photo-real talking head with deep bidirectional lstm, ICASSP, 2015.

F. Y. , Tts synthesis with bidirectional lstm based recurrent neural networks, 2014.

. P. Filntisis-p, . Katsamanis-a, and . Tsiakoulis-p.-&-maragos-p, Video-realistic expressive audio-visual speech synthesis for the greek language, Speech Communication, p.95, 2017.

. Houidhek-a, V. Colotte, and . Mnasri-z.-&-jouvet-d, Dnn-based speech synthesis for arabic : modelling and evaluation, International Conference on Statistical Language and Speech, 2018.

K. S. , Measuring a decade of progress in text-to-speech, Loquens, issue.1, p.1, 2014.

. Klimkov-v and . Moinet-a, Parameter generation algorithms for text-to-speech synthesis with recurrent neural networks, In SLT, 2018.

S. L. Le-maguer, . &. Barbot-n, and . Boeffard-o, Evaluation of contextual descriptors for hmm-based speech synthesis in french, Eighth ISCA Workshop on Speech Synthesis, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00987809

. Mametani-k, . &. Kato-t, and . Yamamoto-s, Investigating context features hidden in end-to-end tts, ICASSP 2019, 2019.

J. Ostermann and . Millen-d, Talking heads and synthetic speech : An architecture for supporting electronic commerce, 2000.

. S. Pandzic-i, J. &. Ostermann, and . Millen-d, User evaluation : Synthetic talking faces for interactive services, The visual computer, vol.15, pp.7-8, 1999.

. Pouget-m, Synthèse incrémentale de la parole à partir du texte, 2017.

. S. Ribeiro-m and . Watts-o.-&-yamagishi-j, Syllable-level representations of suprasegmental features for dnn-based text-to-speech synthesis, INTERSPEECH, 2016.

. Schabus-d and . Pucher-m.-&-hofer-g, Joint audiovisual hidden semi-markov modelbased speech synthesis, IEEE Journal of Selected Topics in Signal Processing, vol.8, issue.2, 2013.

. Sproull-l, When the interface is a face, 1996.

. Watts-o, The role of higher-level linguistic features in hmm-based speech synthesis, 2010.

. Wu-z, Merlin : An open source neural network speech synthesis system, SSW, 2016.

Y. K. , Word-level emphasis modelling in hmm-based speech synthesis, ICASSP, 2010.

H. Ze, Statistical parametric speech synthesis using deep neural networks, 2013.

. Zen-h.-&-senior-a, Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis, ICASSP, 2014.