Introduction to voice presentation attack detection and recent advances, Book chapter N15 of, Handbook of Biometric Anti-Spoofing: Presentation Attack Detection ,
,
, , 2018.
An experimental study of speaker verification sensitivity to computer voice-altered imposters, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No. 99CH36258), vol.2, pp.837-840, 1999. ,
Information technology -biometric presentation attack detection, 2016. ,
Spoofing and countermeasures for automatic speaker verification, Proc. Interspeech, Annual Conf. of the Int, pp.925-929, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-01880306
ASVspoof 2015: the first automatic speaker verification spoofing and countermeasures challenge, Proc. Interspeech, Annual Conf. of the Int, pp.2037-2041, 2015. ,
The ASVspoof 2017 challenge: Assessing the limits of replay spoofing attack detection, Proc. Interspeech, Annual Conf. of the Int. Speech Comm. Assoc, pp.2-6, 2017. ,
, ASVspoof 2019: the automatic speaker verification spoofing and countermeasures challenge evaluation plan
, Asvspoof 2019: Future horizons in spoofed and fake audio detection
URL : https://hal.archives-ouvertes.fr/hal-02172099
, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4779-4783, 2018.
, Wavenet: A generative model for raw audio
CSTR VCTK corpus: English multi-speaker corpus for CSTR voice cloning ,
t-DCF: a detection cost function for the tandem assessment of spoofing countermeasures and automatic speaker verification, Proc. Odyssey, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01880306
Signal estimation from modified short-time Fourier transform, IEEE Trans. ASSP, vol.32, issue.2, pp.236-243, 1984. ,
Statistical parametric speech synthesis using deep neural networks, Proc. ICASSP, pp.7962-7966, 2013. ,
, The English TTS system Flite+HTS engine, 2014.
An autoregressive recurrent mixture density network for parametric speech synthesis, Proc. ICASSP, pp.4895-4899, 2017. ,
, Tutorial on variational autoencoders
Supervised Sequence Labelling with Recurrent Neural Networks, 2008. ,
A comparison of recent waveform generation and acoustic modeling methods for neural-network-based speech synthesis, Proc. ICASSP, pp.4804-4808, 2018. ,
WORLD: A vocoder-based highquality speech synthesis system for real-time applications, IEICE Trans. on Information and Systems, vol.99, issue.7, pp.1877-1884, 2016. ,
Merlin: An open source neural network speech synthesis system, Speech synthesis workshop SSW 2016, 2016. ,
An example of context-dependent label format for HMM-based speech synthesis in Japanese, 2015. ,
Open source voice creation toolkit for the MARY TTS platform, pp.3253-3256, 2011. ,
Creating new language and voice components for the updated MaryTTS text-to-speech synthesis platform, 11th Language Resources and Evaluation Conference (LREC), pp.3171-3175, 2018. ,
Voice conversion from non-parallel corpora using variational auto-encoder, p.2016 ,
, Signal and Information Processing Association Annual Summit and Conference (APSIPA), pp.1-6, 2016.
Voice conversion based on cross-domain features using variational auto encoders, 2018 11th International Symposium on Chinese Spoken Language Processing, pp.51-55, 2018. ,
,
Effect of speech transformation on impostor acceptance, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, vol.1, 2006. ,
URL : https://hal.archives-ouvertes.fr/hal-01318472
Wavecyclegan2: Timedomain neural post-filter for speech waveform generation ,
Neural source-filter-based waveform model for statistical parametric speech synthesis, ICASSP 2019 -2019 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.5916-5920, 2019. ,
Szczepaniak, Fast, compact, and high quality lstm-rnn based statistical parametric speech synthesizers for mobile devices, Interspeech, vol.2016, pp.2273-2277, 2016. ,
Vocaine the vocoder and applications in speech synthesis, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4230-4234, 2015. ,
Transfer learning from speaker verification to multispeaker text-to-speech synthesis, Advances in Neural Information Processing Systems, pp.4480-4490, 2018. ,
Generalized end-to-end loss for speaker verification, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4879-4883, 2018. ,
,
Signal estimation from modified short-time Fourier transform, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.32, issue.2, pp.236-243, 1984. ,
Generative moment matching networks, International Conference on Machine Learning, pp.1718-1727, 2015. ,
Intra-gender statistical singing voice conversion with direct waveform modification using log-spectral differential, Speech Communication, vol.99, pp.211-220, 2018. ,
WaveNet vocoder with limited training data for voice conversion, Annual Conference of the International Speech Communication Association, pp.1983-1987, 2018. ,
Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency based F0 extraction: Possible role of a repetitive structure in sounds, Speech Communication, vol.27, issue.34, pp.187-207, 1999. ,
URL : https://hal.archives-ouvertes.fr/hal-01105608
Generalization of spectrum differential based direct waveform modification for voice conversion, Proc. SSW10, 2019. ,
A spoofing benchmark for the 2018 voice conversion challenge: Leveraging from spoofing countermeasures for speech artifact assessment, Proc. Odyssey 2018 The Speaker and Language Recognition Workshop, pp.187-194, 2018. ,
An overlap-add technique based on waveform similarity (wsola) for high quality time-scale modification of speech, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol.2, pp.554-557, 1993. ,
Mel-generalized cepstral analysis-a unified approach to speech spectral estimation, Third International Conference on Spoken Language Processing, 1994. ,
Non-parallel voice conversion using i-vector plda: towards unifying speaker verification and transformation, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.5535-5539, 2017. ,
Front-end factor analysis for speaker verification, vol.19, pp.788-798, 2011. ,
A small footprint i-vector extractor, Proc. Odyssey 2012: the Speaker and Language Recognition Workshop, 2012. ,
Probabilistic linear discriminant analysis for inferences about identity, IEEE 11th International Conference on Computer Vision, pp.1-8, 2007. ,
Bayesian speaker verification with heavy-tailed priors, Odyssey 2010: The Speaker and Language Recognition Workshop, p.14, 2010. ,
Xvectors: Robust DNN embeddings for speaker recognition ,
Within-class covariance normalization for svm-based speaker recognition, vol.3, 2006. ,
Visualizing data using t-{SNE}, Journal of Machine Learning Research, vol.9, pp.2579-2605, 2008. ,
ASVspoof: the automatic speaker verification spoofing and countermeasures challenge, IEEE Journal of Selected Topics in Signal Processing, vol.11, issue.4, pp.588-604, 2017. ,
An assessment of automatic speaker verification vulnerabilities to replay spoofing attacks, Security and Communication Networks, vol.9, pp.3030-3044, 2016. ,
Sound Reproduction: Loudspeakers and Rooms, Audio Engineering Society Presents Series, 2008. ,
Image Method for Efficiently Simulating Small-Room Acoustics, J. Acoust. Soc. Am, vol.65, issue.4, pp.943-950, 1979. ,
, , 2008.
Xvectors: Robust DNN embeddings for speaker recognition, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, pp.5329-5333, 2018. ,
A study on data augmentation of reverberant speech for robust speech recognition, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, pp.5220-5224, 2017. ,
Probabilistic linear discriminant analysis for inferences about identity, 2007 IEEE 11th International Conference on Computer Vision, pp.1-8, 2007. ,
The Kaldi speech recognition toolkit, IEEE Signal Processing Society, 2011. ,
, Voxceleb: a large-scale speaker identification dataset
On robustness of unsupervised domain adaptation for speaker recognition, pp.2958-2962, 2019. ,
Probabilistic linear discriminant analysis, European Conference on Computer Vision, pp.531-542, 2006. ,
A new feature for automatic speaker verification anti-spoofing: Constant Q cepstral coefficients, Proc. Odyssey, 2016. ,
Constant Q cepstral coefficients: A spoofing countermeasure for automatic speaker verification, Computer Speech & Language, vol.45, pp.516-535, 2017. ,
A comparison of features for synthetic speech detection, Proc. Interspeech, Annual Conf. of the Int, pp.2087-2091, 2015. ,
Anti-spoofing for text-independent speaker verification: An initial database, comparison of countermeasures, and human performance, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.4, pp.768-783, 2016. ,
Bias and statistical significance in evaluating speech synthesis with Mean Opinion Scores, pp.3976-3980, 2017. ,