Fast i-vector denoising using MAP estimation and a noise distributions database for robust speaker recognition

Abstract : Since the i-vector paradigm was introduced in the field of speaker recognition, many techniques have been proposed to deal with additive noise within this framework. Because the effect of noise in the i-vector space is complex, much of this effort has addressed noise in other domains (speech enhancement, feature compensation, robust i-vector extraction and robust scoring). As far as we know, there has been no serious attempt to handle the noise problem directly in the i-vector space without relying on data distributions computed in a prior domain. The aim of this paper is twofold. First, it proposes a full-covariance Gaussian model of the clean i-vector and noise distributions in the i-vector space, and introduces a technique that estimates a clean i-vector from its noisy version and the noise density function using the MAP approach. Based on NIST data, we show that it is possible to improve the baseline system performance by up to 60%. Second, in order to make this algorithm usable in a real application and to reduce the computational time needed by i-MAP, we propose an extension that builds a noise distribution database in the i-vector space in an off-line step and uses it later in the test phase. We show that, with a sufficiently large noise distribution database, this approach achieves comparable results (up to 57% relative EER improvement).
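The MAP step described in the abstract can be illustrated with a minimal sketch. It assumes an additive model in the i-vector space, y = x + n, with a clean i-vector x ~ N(mu_x, sigma_x) and noise n ~ N(mu_n, sigma_n), both full-covariance Gaussians. The abstract does not state this model or any closed form, so the model, the function name `map_denoise` and all parameter names are illustrative assumptions, not the authors' implementation. Under these assumptions the posterior of x given y is Gaussian, and its mode solves (sigma_x^-1 + sigma_n^-1) x = sigma_n^-1 (y - mu_n) + sigma_x^-1 mu_x.

```python
import numpy as np


def map_denoise(y, mu_x, sigma_x, mu_n, sigma_n):
    """MAP estimate of a clean i-vector x from a noisy one y.

    Assumes (hypothetically) y = x + n with x ~ N(mu_x, sigma_x)
    and n ~ N(mu_n, sigma_n), both with full covariance matrices.
    """
    prec_x = np.linalg.inv(sigma_x)  # precision of the clean i-vector prior
    prec_n = np.linalg.inv(sigma_n)  # precision of the noise distribution
    # The posterior precision is the sum of the two precisions.
    a = prec_x + prec_n
    # Right-hand side: noise-corrected observation and prior mean,
    # each weighted by its own precision.
    b = prec_n @ (y - mu_n) + prec_x @ mu_x
    # Solve a @ x_hat = b rather than inverting a explicitly.
    return np.linalg.solve(a, b)


# Toy usage with made-up 2-dimensional values (real i-vectors are much larger).
y = np.array([2.0, 4.0])
x_hat = map_denoise(y, np.zeros(2), np.eye(2), np.zeros(2), np.eye(2))
```

With equal prior and noise covariances and zero means, as in the toy call, the estimate lies halfway between the prior mean and the observation. As the noise covariance shrinks, the estimate approaches y - mu_n.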

https://hal.archives-ouvertes.fr/hal-02157200
Contributor : Jean-François Bonastre
Submitted on : Saturday, June 15, 2019 - 4:54:06 PM
Last modification on : Tuesday, July 2, 2019 - 5:38:02 PM

File

papier__csl.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02157200, version 1

Citation

Waad Kheder, Driss Matrouf, Pierre-Michel Bousquet, Jean-François Bonastre, Moez Ajili. Fast i-vector denoising using MAP estimation and a noise distributions database for robust speaker recognition. Computer Speech and Language, Elsevier, 2017. ⟨hal-02157200⟩
