F. Kelly, A. Alexander, O. Forth, S. Kent, J. Lindh et al., Identifying perceptually similar voices with a speaker recognition system using autophonetic features, INTERSPEECH, pp.1567-1568, 2016.

C. Zhang and T. Tan, Voice disguise and automatic speaker recognition, Forensic science international, vol.175, issue.2, pp.118-122, 2008.

J. Lindh and A. Eriksson, Voice similarity-a comparison between judgements by human listeners and automatic voice comparison, Proceedings from FONETIK, pp.63-69, 2010.

J. Laver, The phonetic description of voice quality, Cambridge Studies in Linguistics London, vol.31, pp.1-186, 1980.

K. Mcdougall, Assessing perceived voice similarity using multidimensional scaling for the construction of voice parades, International Journal of Speech, vol.20, issue.2, 2013.

P. Rose, Differences and distinguishability in the acoustic characteristics of hello in voices of similar-sounding speakers, Australian Review of Applied Linguistics, vol.22, issue.1, pp.1-42, 1999.

D. Loakes, A forensic phonetic investigation into the speech patterns of identical and non-identical twins, 2006.

F. Nolan, P. French, K. Mcdougall, L. Stevens, and T. Hudson, The role of voice quality settings in perceived voice similarity, International Association for Forensic Phonetics and Acoustics, 2011.

O. Baumann and P. Belin, Perceptual scaling of voice identity: common dimensions for different vowels and speakers, Psychological Research PRPF, vol.74, issue.1, p.110, 2010.

Y. Ijima and H. Mizuno, Similar speaker selection technique based on distance metric learning using highly correlated acoustic features with perceptual voice quality similarity, IEICE TRANSACTIONS on Information and Systems, vol.98, issue.1, pp.157-165, 2015.

E. S. Segundo and J. A. Mompean, A simplified vocal profile analysis protocol for the assessment of voice quality and speaker similarity, Journal of Voice, vol.31, issue.5, pp.644-655, 2017.

N. Obin, A. Roebel, and G. Bachman, On automatic voice casting for expressive speech: Speaker recognition vs. speech classification, Acoustics, Speech and Signal Processing, pp.950-954, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00943796

N. Obin and A. Roebel, Similarity search of acted voices for automatic voice casting, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.9, pp.1642-1651, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01464715

A. Gresse, M. Rouvier, R. Dufour, V. Labatut, and J. Bonastre, Acoustic pairing of original and dubbed voices in the context of video game localization, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01572151

J. Bromley, I. Guyon, Y. Lecun, E. Säckinger, and R. Shah, Signature verification using a" siamese" time delay neural network, Advances in Neural Information Processing Systems, pp.737-744, 1994.

S. Chopra, R. Hadsell, and Y. Lecun, Learning a similarity metric discriminatively, with application to face verification, Computer Vision and Pattern Recognition, vol.1, pp.539-546, 2005.

R. Hadsell, S. Chopra, and Y. Lecun, Dimensionality reduction by learning an invariant mapping, Computer vision and pattern recognition, vol.2, pp.1735-1742, 2006.

G. Koch, R. Zemel, and R. Salakhutdinov, Siamese neural networks for one-shot image recognition, ICML Deep Learning Workshop, vol.2, 2015.

N. Zeghidour, G. Synnaeve, N. Usunier, and E. Dupoux, Joint learning of speaker and phonetic similarities with siamese networks, INTERSPEECH, pp.1295-1299, 2016.

N. Dehak, J. Patrick, R. Kenny, P. Dehak, P. Dumouchel et al., Front-end factor analysis for speaker verification, IEEE Transactions on Audio, Speech, and Language Processing, vol.19, issue.4, pp.788-798, 2011.

A. Kumar-sarkar, J. Bonastre, and D. Matrouf, A study on the roles of total variability space and session variability modeling in speaker recognition, International Journal of Speech Technology, vol.19, issue.1, pp.111-120, 2016.

D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek et al., The kaldi speech recognition toolkit, IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, p.11, 2011.

F. Chollet, Keras, 2015.

X. Glorot and Y. Bengio, Understanding the difficulty of training deep feedforward neural networks, Proceedings of the thirteenth international conference on artificial intelligence and statistics, pp.249-256, 2010.