Fast i-vector denoising using MAP estimation and a noise distributions database for robust speaker recognition - HAL open archive
Journal article, Computer Speech and Language, Year: 2017

Fast i-vector denoising using MAP estimation and a noise distributions database for robust speaker recognition

Abstract

Since the i-vector paradigm was introduced in the field of speaker recognition, many techniques have been proposed to deal with additive noise within this framework. Because the effect of noise in the i-vector space is complex, much of this effort has gone into dealing with noise in other domains (speech enhancement, feature compensation, robust i-vector extraction and robust scoring). As far as we know, there has been no serious attempt to handle the noise problem directly in the i-vector space without relying on data distributions computed in a prior domain. The aim of this paper is twofold. First, it proposes a full-covariance Gaussian modeling of the clean i-vector and noise distributions in the i-vector space and introduces a technique to estimate a clean i-vector given its noisy version and the noise density function, using the MAP approach. Based on NIST data, we show that it is possible to improve the baseline system performance by up to 60%. Second, in order to make this algorithm usable in a real application and reduce the computational time needed by i-MAP, we propose an extension that builds a noise distribution database in the i-vector space in an off-line step and uses it later in the test phase. We show that this approach achieves comparable results (up to 57% relative EER improvement) with a sufficiently large noise distribution database.
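The MAP step described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes an additive model y = x + n in the i-vector space, with the clean i-vector x and the noise n independent and each following a full-covariance Gaussian. Under those assumptions the posterior of x given y is Gaussian, so its mode has a closed form. The function name `map_denoise` and all numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np


def map_denoise(y, mu_x, sigma_x, mu_n, sigma_n):
    """MAP estimate of a clean i-vector x from a noisy i-vector y.

    Assumption (not stated in the abstract): y = x + n, with
    x ~ N(mu_x, sigma_x) and n ~ N(mu_n, sigma_n) independent.
    """
    # Precision matrices of the clean-i-vector prior and of the noise model.
    prec_x = np.linalg.inv(sigma_x)
    prec_n = np.linalg.inv(sigma_n)
    # Setting the gradient of the log-posterior to zero gives the linear system
    # (prec_x + prec_n) x = prec_n (y - mu_n) + prec_x mu_x.
    rhs = prec_n @ (y - mu_n) + prec_x @ mu_x
    return np.linalg.solve(prec_x + prec_n, rhs)


# Toy usage with synthetic 3-dimensional "i-vectors" (illustrative values only).
rng = np.random.default_rng(0)
mu_x, sigma_x = np.zeros(3), np.eye(3)
mu_n, sigma_n = np.full(3, 0.5), 0.1 * np.eye(3)
x_clean = rng.multivariate_normal(mu_x, sigma_x)
y_noisy = x_clean + rng.multivariate_normal(mu_n, sigma_n)
x_hat = map_denoise(y_noisy, mu_x, sigma_x, mu_n, sigma_n)
print("clean:   ", x_clean)
print("noisy:   ", y_noisy)
print("denoised:", x_hat)
```

In this sketch the noise parameters (`mu_n`, `sigma_n`) are simply given. In the database extension mentioned in the abstract, they would presumably be retrieved from distributions estimated off-line, but how the matching entry is selected is not specified here.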
Main file
papier__csl.pdf (644.56 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02157200, version 1 (15-06-2019)

Identifiers

  • HAL Id: hal-02157200, version 1

Cite

Waad Ben Kheder, Driss Matrouf, Pierre-Michel Bousquet, Jean-François Bonastre, Moez Ajili. Fast i-vector denoising using MAP estimation and a noise distributions database for robust speaker recognition. Computer Speech and Language, 2017. ⟨hal-02157200⟩

Collections

UNIV-AVIGNON LIA