Decoupling session variability modelling and speaker characterisation

Abstract : The Factor Analysis (FA) framework has proven highly effective at modelling session variability in recent years. However, training the FA parameters requires a large amount of training data. When the available database is small, the number of components of the core statistical model, the Universal Background Model (UBM), is also limited, because the UBM drives the dimension of the main FA matrix. Since the size of the UBM directly determines the size of the speaker supervector (the concatenation of the GMM mean parameters), this also limits the intrinsic capacity of the recognition system and lowers the performance that can be expected. This paper aims to remove this limitation by breaking the intrinsic link between the FA dimensionality and the UBM dimensionality. Session variability is modelled in a smaller dimension than that of the UBM, which drives the discriminative power of the system. The first experimental results reported in this paper, obtained using the NIST-SRE 2008 framework, are encouraging: a relative equal error rate (EER) improvement of about 18% is obtained when a 512-component UBM is combined with a 32-component session variability model, compared with a 32-component UBM combined with the same variability model.
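To make the dimensionality argument of the abstract concrete, the sketch below builds a supervector by concatenating GMM mean vectors and compares its length for a 32-component and a 512-component UBM. The feature dimension (50) and the session subspace rank (40) are hypothetical values chosen only for illustration, and the size of the session variability matrix assumes the usual FA formulation in which that matrix has one row per supervector coefficient; none of these values is taken from the paper.

```python
import numpy as np

FEATURE_DIM = 50  # hypothetical acoustic feature dimension (assumption)
RANK = 40         # hypothetical rank of the session subspace (assumption)


def supervector(means: np.ndarray) -> np.ndarray:
    """Concatenate the C mean vectors (shape C x F) into one vector of length C*F."""
    return means.reshape(-1)


def session_matrix_size(n_components: int, feature_dim: int, rank: int) -> int:
    """Number of entries in a (C*F) x rank session variability matrix.

    Assumes the usual FA layout where the matrix has one row per
    supervector coefficient; this is an assumption, not the paper's code.
    """
    return n_components * feature_dim * rank


# Placeholder means: only the shapes matter for this illustration.
small = supervector(np.zeros((32, FEATURE_DIM)))
large = supervector(np.zeros((512, FEATURE_DIM)))

print(small.shape, large.shape)
print(session_matrix_size(32, FEATURE_DIM, RANK),
      session_matrix_size(512, FEATURE_DIM, RANK))
```

Under these assumptions, moving from 32 to 512 components multiplies both the supervector length and the number of FA parameters to estimate by 16, which is the coupling the abstract proposes to break.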
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01317698
Contributor : Bibliothèque Universitaire Déposants Hal-Avignon
Submitted on : Monday, November 19, 2018 - 10:31:20 AM
Last modification on : Tuesday, July 2, 2019 - 5:38:02 PM
Long-term archiving on : Wednesday, February 20, 2019 - 1:03:48 PM

File

Expanded_FA.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01317698, version 1

Citation

Anthony Larcher, Christophe Lévy, Driss Matrouf, Jean-François Bonastre. Decoupling session variability modelling and speaker characterisation. INTERSPEECH, Sep 2010, Makuhari, Japan. ⟨hal-01317698⟩
