Learning noise robust ResNet-based speaker embedding for speaker recognition

Mohammad Mohammadamini; Driss Matrouf; Jean-François Bonastre; Sandipana Dowerah; Romain Serizel; Denis Jouvet

Conference paper, Year: 2022

Mohammad Mohammadamini

Role: Author
PersonId: 1133019

Laboratoire Informatique d'Avignon

Driss Matrouf

Role: Author
PersonId: 1133020

Jean-François Bonastre

Role: Author

Sandipana Dowerah

Role: Author

Romain Serizel

Role: Author
PersonId: 1125745

Denis Jouvet

Role: Author
PersonId: 15904
IdHAL: denis-jouvet
IdRef: 029418666

Abstract

Background noise and reverberation, especially in far-field speech utterances, degrade the performance of speaker recognition systems. This challenge has been addressed at different levels, from signal-level processing in the front end to adaptation of the scoring technique in the back end. In this paper, we propose two new variants of ResNet-based speaker recognition systems that make the speaker embedding more robust to additive noise and reverberation. The goal of the proposed systems is to extract, in noisy environments, x-vectors that are close to the corresponding x-vectors from a clean environment. To do so, the speaker embedding network jointly minimizes the speaker classification loss and the distance between pairs of noisy and clean x-vectors. The proposed systems are tested with four evaluation protocols and compared with the baseline ResNet system. Across conditions with real and simulated noise and reverberation, the modified systems outperform the baseline. In the presence of artificial noise and reverberation, we achieve a 19% improvement in equal error rate (EER). The main advantage of the proposed systems is their effectiveness against real noise and reverberation, where we achieve a 15% improvement in EER.
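The joint objective described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distance measure, the weighting between the two terms, or how the two proposed variants differ, so everything below is an assumption made for illustration: the mean squared error as the distance, the `distance_weight` parameter, and the function names are all hypothetical and are not taken from the paper.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def squared_distance(a, b):
    """Mean squared error between two embeddings of equal dimension.

    Assumption: the abstract only says "distance"; MSE is one possible choice.
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_loss(noisy_logits, label, noisy_xvector, clean_xvector,
               distance_weight=1.0):
    """Speaker classification loss plus a weighted noisy/clean distance term.

    distance_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    classification = cross_entropy(noisy_logits, label)
    distance = squared_distance(noisy_xvector, clean_xvector)
    return classification + distance_weight * distance
```

In this sketch, when the noisy x-vector equals its clean counterpart the distance term vanishes and the loss reduces to the classification loss alone; any mismatch between the pair adds a penalty that pulls the noisy embedding toward the clean one during training.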

Keywords

Speaker recognition; ResNet; Additive noise; Reverberation; Robustness

Domains

Artificial Intelligence [cs.AI]

Main file

Learning noise robust ResNet-based speaker embedding for speaker recognition.pdf (192.9 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-03650549

Submitted on: Monday, April 25, 2022, 10:14:23

Last modified on: Friday, May 6, 2022, 03:46:31

Long-term archiving on: Tuesday, July 26, 2022, 18:37:30

Dates and versions

hal-03650549, version 1 (25-04-2022)

Identifiers

HAL Id: hal-03650549, version 1

Cite

Mohammad Mohammadamini, Driss Matrouf, Jean-François Bonastre, Sandipana Dowerah, Romain Serizel, et al. Learning noise robust ResNet-based speaker embedding for speaker recognition. Odyssey 2022: The Speaker and Language Recognition Workshop, Jun 2022, Beijing, China. ⟨hal-03650549⟩

Collections

UNIV-AVIGNON LIA
