A. W. Black, H. Zen, and K. Tokuda, Statistical Parametric Speech Synthesis, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07, pp.1229-1232, 2007.
DOI : 10.1109/ICASSP.2007.367298

A. J. Hunt and A. W. Black, Unit selection in a concatenative speech synthesis system using a large speech database, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, pp.373-376, 1996.
DOI : 10.1109/ICASSP.1996.541110

D. Lolive, P. Alain, N. Barbot, J. Chevelu, G. Lecorvé et al., The IRISA text-to-speech system for the Blizzard Challenge 2017, Proceedings of the Blizzard Challenge Workshop, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01662361

T. Merritt, R. A. Clark, Z. Wu, J. Yamagishi, and S. King, Deep neural network-guided unit selection synthesis, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pp.5145-5149, 2016.

M. Morise, F. Yokomori, and K. Ozawa, WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications, IEICE Transactions on Information and Systems, vol.99, issue.7, pp.1877-1884, 2016.
DOI : 10.1587/transinf.2015EDP7457

A. van den Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals et al., WaveNet: A generative model for raw audio, Proceedings of the ISCA Speech Synthesis Workshop (SSW), p.125, 2016.

A. Perquin, Big deep voice: indexation de données massives de parole grâce à des réseaux de neurones profonds, University of Rennes 1, 2017.

V. Wan, Y. Agiomyrgiannakis, H. Silen, and J. Vit, Google's next-generation real-time unit-selection synthesizer using sequence-to-sequence LSTM-based autoencoders, Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech), pp.1143-1147, 2017.

Y. Wang, R. J. Skerry-Ryan, D. Stanton, Y. Wu, R. J. Weiss et al., Tacotron: Towards End-to-End Speech Synthesis, Interspeech 2017, pp.4006-4010, 2017.
DOI : 10.21437/Interspeech.2017-1452

Z. Wu and S. King, Improving Trajectory Modelling for DNN-Based Speech Synthesis by Using Stacked Bottleneck Features and Minimum Generation Error Training, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.7, pp.1255-1265, 2016.
DOI : 10.1109/TASLP.2016.2551865

Z. Wu, O. Watts, and S. King, Merlin: An Open Source Neural Network Speech Synthesis System, 9th ISCA Speech Synthesis Workshop, pp.218-223, 2016.
DOI : 10.21437/SSW.2016-33

Z. J. Yan, Y. Qian, and F. K. Soong, Rich-context Unit Selection (RUS) approach to high quality TTS, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4798-4801, 2010.
DOI : 10.1109/ICASSP.2010.5495150

H. Zen, A. Senior, and M. Schuster, Statistical parametric speech synthesis using deep neural networks, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.7962-7966, 2013.
DOI : 10.1109/ICASSP.2013.6639215