N. Obin, A. Roebel, and G. Bachman, On automatic voice casting for expressive speech: Speaker recognition vs. speech classification, International Conference on Acoustics, Speech and Signal Processing, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00943796

N. Obin and A. Roebel, Similarity search of acted voices for automatic voice casting, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, pp.1642-1651, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01464715

A. Gresse, M. Rouvier, R. Dufour, V. Labatut, and J. Bonastre, Acoustic pairing of original and dubbed voices in the context of video game localization, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01572151

A. Gresse, M. Quillot, R. Dufour, V. Labatut, and J. Bonastre, Similarity metric based on siamese neural networks for voice casting, International Conference on Acoustics, Speech and Signal Processing, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02004762

E. Variani, X. Lei, E. McDermott, I. L. Moreno, and J. Gonzalez-Dominguez, Deep neural networks for small footprint text-dependent speaker verification, International Conference on Acoustics, Speech and Signal Processing, 2014.

D. Snyder, P. Ghahremani, D. Povey, D. Garcia-Romero, Y. Carmiel et al., Deep neural network-based speaker embeddings for end-to-end speaker verification, Spoken Language Technology Workshop (SLT), 2016.

D. Snyder, D. Garcia-Romero, D. Povey, and S. Khudanpur, Deep neural network embeddings for text-independent speaker verification, INTERSPEECH, 2017.

D. Snyder, D. Garcia-Romero, G. Sell, D. Povey, and S. Khudanpur, X-vectors: Robust DNN embeddings for speaker recognition, International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018.

Y. Bengio, A. Courville, and P. Vincent, Representation learning: A review and new perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, pp.1798-1828, 2013.

D. Lopez-Paz, L. Bottou, B. Schölkopf, and V. Vapnik, Unifying distillation and privileged information, International Conference on Learning Representations, 2016.

V. Vapnik and R. Izmailov, Learning using privileged information: similarity control and knowledge transfer, Journal of Machine Learning Research, vol.16, pp.2023-2049, 2015.

G. Hinton, O. Vinyals, and J. Dean, Distilling the knowledge in a neural network, 2015.

R. Price, K. Iso, and K. Shinoda, Wise teachers train better DNN acoustic models, EURASIP Journal on Audio, Speech, and Music Processing, vol.2016, 2016.

K. Markov and T. Matsui, Robust speech recognition using generalized distillation framework, INTERSPEECH, 2016.

J. Li, M. L. Seltzer, X. Wang, R. Zhao, and Y. Gong, Large-scale domain adaptation via teacher-student learning, 2017.

S. Watanabe, T. Hori, J. Le Roux, and J. R. Hershey, Student-teacher network learning with enhanced features, International Conference on Acoustics, Speech and Signal Processing, 2017.

T. Asami, R. Masumura, Y. Yamaguchi, H. Masataki, and Y. Aono, Domain adaptation of DNN acoustic models using knowledge distillation, International Conference on Acoustics, Speech and Signal Processing, 2017.

N. M. Joy, S. R. Kothinti, S. Umesh, and B. Abraham, Generalized distillation framework for speaker normalization, 2017.

D. Povey, A. Ghoshal, G. Boulianne, L. Burget, and O. Glembek, The Kaldi speech recognition toolkit, IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, 2011.

J. S. Chung, A. Nagrani, and A. Zisserman, VoxCeleb2: Deep speaker recognition, INTERSPEECH, 2018.

F. Chollet, Keras, 2015.

X. Glorot and Y. Bengio, Understanding the difficulty of training deep feedforward neural networks, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010.