Journal articles

Fast i-vector denoising using MAP estimation and a noise distributions database for robust speaker recognition

Abstract: Since the i-vector paradigm was introduced in the field of speaker recognition, many techniques have been proposed to deal with additive noise within this framework. Because the effect of noise in the i-vector space is complex, much effort has gone into dealing with noise in other domains (speech enhancement, feature compensation, robust i-vector extraction and robust scoring). As far as we know, there has been no serious attempt to handle the noise problem directly in the i-vector space without relying on data distributions computed in a prior domain. The aim of this paper is twofold. First, it proposes a full-covariance Gaussian modeling of the clean i-vector and noise distributions in the i-vector space and introduces a technique (i-MAP) to estimate a clean i-vector given the noisy version and the noise density function using the MAP approach. Based on NIST data, we show that it is possible to improve the baseline system performance by up to 60%. Second, in order to make this algorithm usable in a real application and reduce the computational time needed by i-MAP, we propose an extension that requires building a noise distribution database in the i-vector space in an off-line step and using it later in the test phase. We show that, with a sufficiently large noise distribution database, this approach achieves comparable results (up to 57% relative EER improvement).
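
The MAP estimation step described in the abstract can be illustrated with a minimal sketch. It assumes an additive model in the i-vector space, y = x + n, where the clean i-vector x and the noise n are independent full-covariance Gaussians; the abstract does not spell out the exact model, so this assumption, the function name `map_denoise`, and all parameter names are hypothetical and not taken from the paper. Under that assumption, maximizing p(x | y) gives a closed-form linear solution.

```python
import numpy as np


def map_denoise(y, mu_x, sigma_x, mu_n, sigma_n):
    """MAP estimate of a clean vector x from a noisy vector y.

    Assumed model (an illustration, not the paper's stated model):
        y = x + n,  x ~ N(mu_x, sigma_x),  n ~ N(mu_n, sigma_n)
    Setting the gradient of log p(y | x) + log p(x) to zero gives
        (sigma_x^-1 + sigma_n^-1) x = sigma_n^-1 (y - mu_n) + sigma_x^-1 mu_x
    """
    prec_x = np.linalg.inv(sigma_x)  # precision of the clean prior
    prec_n = np.linalg.inv(sigma_n)  # precision of the noise model
    lhs = prec_x + prec_n
    rhs = prec_n @ (y - mu_n) + prec_x @ mu_x
    # Solve the linear system rather than inverting lhs explicitly.
    return np.linalg.solve(lhs, rhs)


# Toy usage with synthetic 3-dimensional vectors (not real i-vectors).
rng = np.random.default_rng(0)
dim = 3
mu_x = np.zeros(dim)
sigma_x = np.eye(dim)
mu_n = np.full(dim, 0.5)
sigma_n = 0.25 * np.eye(dim)

x_clean = rng.multivariate_normal(mu_x, sigma_x)
y_noisy = x_clean + rng.multivariate_normal(mu_n, sigma_n)
x_hat = map_denoise(y_noisy, mu_x, sigma_x, mu_n, sigma_n)
print("noisy:", y_noisy)
print("estimate:", x_hat)
```

In this sketch the noise parameters `mu_n` and `sigma_n` are simply passed in; in the extension described in the abstract they would presumably come from a precomputed noise distribution database, whose construction and lookup are not covered here.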

Contributor : Jean-François Bonastre
Submitted on : Saturday, June 15, 2019 - 4:54:06 PM
Last modification on : Tuesday, January 14, 2020 - 10:38:07 AM


Files produced by the author(s)


  • HAL Id : hal-02157200, version 1



Waad Kheder, Driss Matrouf, Pierre-Michel Bousquet, Jean-François Bonastre, Moez Ajili. Fast i-vector denoising using MAP estimation and a noise distributions database for robust speaker recognition. Computer Speech and Language, Elsevier, 2017. ⟨hal-02157200⟩


