
Learning noise robust ResNet-based speaker embedding for speaker recognition

Abstract : Background noise and reverberation, especially in distant speech utterances, degrade the performance of speaker recognition systems. This challenge has been addressed at several levels, from signal-level processing in the front end to adaptation of the scoring technique in the back end. In this paper, we propose two new variants of ResNet-based speaker recognition systems that make the speaker embedding more robust to additive noise and reverberation. The goal of the proposed systems is to extract x-vectors in noisy environments that are close to the corresponding x-vectors in a clean environment. To do so, the speaker embedding network jointly minimizes the speaker classification loss and the distance between pairs of noisy and clean x-vectors. The proposed systems are tested with four evaluation protocols and compared with the baseline ResNet system. Across conditions with real and simulated noise and reverberation, the modified systems outperform the baseline. In the presence of artificial noise and reverberation, we achieve a 19% improvement in EER. The main advantage of the proposed systems is their effectiveness against real noise and reverberation, where we achieve a 15% improvement in EER.
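The joint objective described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distance measure, how the two terms are weighted, or which input the classification loss is computed on, so the mean squared error, the `distance_weight` parameter, and the use of logits from the noisy utterance are all assumptions made here for illustration. The function names are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Softmax cross-entropy for one utterance (speaker classification loss)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def mean_squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two embeddings (assumed distance measure)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_loss(
    logits: Sequence[float],
    speaker_id: int,
    noisy_xvector: Sequence[float],
    clean_xvector: Sequence[float],
    distance_weight: float = 1.0,  # hypothetical weighting, not from the paper
) -> float:
    """Classification loss plus a weighted noisy-to-clean embedding distance."""
    classification = cross_entropy(logits, speaker_id)
    distance = mean_squared_distance(noisy_xvector, clean_xvector)
    return classification + distance_weight * distance
```

In this sketch, a noisy x-vector that exactly matches its clean counterpart contributes nothing to the distance term, so the loss reduces to the classification loss alone. In a real training setup both terms would be computed on batches and backpropagated through the embedding network.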
Document type : Conference papers

Contributor : Mohammad Mohammadamini
Submitted on : Monday, April 25, 2022 - 10:14:23 AM
Last modification on : Friday, May 6, 2022 - 3:46:31 AM




  • HAL Id : hal-03650549, version 1



Mohammad Mohammadamini, Driss Matrouf, Jean-François Bonastre, Sandipana Dowerah, Romain Serizel, et al. Learning noise robust ResNet-based speaker embedding for speaker recognition. Odyssey 2022 : The Speaker and Language Recognition Workshop, Jun 2022, Beijing, China. ⟨hal-03650549⟩


