N. Tomashenko, B. M. Srivastava, X. Wang, E. Vincent, A. Nautsch et al., Introducing the VoicePrivacy initiative, 2020.
URL: https://hal.archives-ouvertes.fr/hal-02562199

K. Hashimoto, J. Yamagishi, and I. Echizen, Privacy-preserving sound to degrade automatic speaker verification performance, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.5500-5504, 2016.

J. Qian, H. Du, J. Hou, L. Chen, T. Jung et al., VoiceMask: Anonymize and sanitize voice input on mobile devices, 2017.

Q. Jin, A. R. Toth, T. Schultz, and A. W. Black, Speaker deidentification via voice transformation, 2009 IEEE Workshop on Automatic Speech Recognition and Understanding, pp.529-533, 2009.

M. Pobar and I. Ipšić, Online speaker de-identification using voice transformation, 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), pp.1264-1267, 2014.

F. Bahmaninezhad, C. Zhang, and J. H. Hansen, Convolutional neural network based speaker de-identification, pp.255-260, 2018.

T. Justin, V. Štruc, S. Dobrišek, B. Vesnicer, I. Ipšić et al., Speaker de-identification using diphone recognition and speech synthesis, 2015 IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), vol.4, pp.1-7, 2015.

F. Fang, X. Wang, J. Yamagishi, I. Echizen, M. Todisco et al., Speaker anonymization using X-vector and neural waveform models, Speech Synthesis Workshop, pp.155-160, 2019.

B. M. Srivastava, A. Bellet, M. Tommasi, and E. Vincent, Privacy-preserving adversarial representation learning in ASR: Reality or illusion?, in Interspeech, pp.3700-3704, 2019.
URL: https://hal.archives-ouvertes.fr/hal-02166434

B. M. Srivastava, N. Vauquier, M. Sahidullah, A. Bellet, M. Tommasi et al., Evaluating voice conversion-based privacy protection against informed attackers, 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020.
URL: https://hal.archives-ouvertes.fr/hal-02355115

D. Snyder, D. Garcia-Romero, G. Sell, D. Povey, and S. Khudanpur, X-vectors: Robust DNN embeddings for speaker recognition, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.5329-5333, 2018.

X. Wang, S. Takaki, and J. Yamagishi, Neural source-filter-based waveform model for statistical parametric speech synthesis, 2019 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.5916-5920, 2019.

Y. Han, S. Li, Y. Cao, Q. Ma, and M. Yoshikawa, Voice-indistinguishability: Protecting voiceprint in privacy-preserving speech data release, 2020.

P. Kenny, Bayesian speaker verification with heavy-tailed priors, p.14, 2010.

I. Salmun, I. Opher, and I. Lapidot, On the use of PLDA i-vector scoring for clustering short segments, pp.407-414, 2016.

N. Tomashenko, B. M. Srivastava, X. Wang, E. Vincent, A. Nautsch et al., The VoicePrivacy 2020 Challenge evaluation plan, 2020.

S. Ioffe, Probabilistic linear discriminant analysis, European Conference on Computer Vision (ECCV), pp.531-542, 2006.

J. Rohdin, S. Biswas, and K. Shinoda, Constrained discriminative PLDA training for speaker verification, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.1670-1674, 2014.

D. Dueck, Affinity propagation: clustering data by passing messages, 2009.

A. Nagrani, J. S. Chung, and A. Zisserman, VoxCeleb: a large-scale speaker identification dataset, pp.2616-2620, 2017.

J. S. Chung, A. Nagrani, and A. Zisserman, VoxCeleb2: Deep speaker recognition, in Interspeech, pp.1086-1090, 2018.

V. Panayotov, G. Chen, D. Povey, and S. Khudanpur, Librispeech: an ASR corpus based on public domain audio books, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.5206-5210, 2015.

H. Zen, V. Dang, R. Clark, Y. Zhang, R. J. Weiss et al., LibriTTS: A corpus derived from LibriSpeech for text-to-speech, pp.1526-1530, 2019.

D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek et al., The Kaldi speech recognition toolkit, Tech. Rep., 2011.

A. Nautsch, Speaker recognition in unconstrained environments, 2019.

N. Brümmer, Measuring, refining and calibrating speaker and language information extracted from speech, 2010.

M. Gomez-Barrero, J. Galbally, C. Rathgeb, and C. Busch, General framework to evaluate unlinkability in biometric template protection systems, IEEE Transactions on Information Forensics and Security, vol.13, issue.6, pp.1406-1420, 2017.