Discriminant binary data representation for speaker recognition

Abstract: In the UBM/GMM supervector paradigm, each acoustic file is represented by the mean parameters of a GMM. The resulting supervector space serves as the data representation space, but it has a high dimensionality. Moreover, this space is not intrinsically discriminant, and a complete speech segment is represented by a single vector, which largely rules out the use of temporal or sequential information. This work proposes a new approach in which each acoustic frame is represented in a discriminant binary space. The approach relies on a UBM to structure the acoustic space into regions. Each region is then populated with a set of Gaussian models, denoted "specificities", that emphasize speaker-specific information. Each acoustic frame is mapped into the discriminant binary space by turning each specificity "on" or "off", which yields a large binary vector. All subsequent steps (speaker reference extraction, likelihood estimation, and decision) take place in this binary space. Although this work is only a first step in this direction, experiments based on the NIST SRE 2008 framework demonstrate the potential of the proposed approach. Moreover, the approach opens the opportunity to rethink all the classical processes from a discrete, binary viewpoint.
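
The frame-to-binary mapping described in the abstract can be illustrated with a minimal sketch. The abstract does not state the activation rule, so the one used here is an assumption made for illustration only: the most likely UBM region(s) are selected, and within each the most likely specificities are turned "on". The function names, the parameters `top_regions` and `top_specs`, and the toy values are all hypothetical and are not taken from the paper.

```python
import numpy as np

def log_gauss_diag(x, means, variances):
    """Log-density of frame x under each diagonal-covariance Gaussian (one per row)."""
    diff = x - means
    return -0.5 * np.sum(np.log(2 * np.pi * variances) + diff**2 / variances, axis=1)

def binarize_frame(x, ubm_means, ubm_vars, spec_means, spec_vars,
                   top_regions=1, top_specs=2):
    """Map one acoustic frame to a binary vector with one bit per specificity.

    spec_means and spec_vars have shape (n_regions, n_specs, dim).
    ASSUMED rule (not from the paper): keep the `top_regions` most likely
    UBM components, and in each one turn on the `top_specs` most likely
    specificities; every other bit stays off.
    """
    n_regions, n_specs, _ = spec_means.shape
    bits = np.zeros(n_regions * n_specs, dtype=np.uint8)
    region_scores = log_gauss_diag(x, ubm_means, ubm_vars)
    for r in np.argsort(region_scores)[-top_regions:]:
        spec_scores = log_gauss_diag(x, spec_means[r], spec_vars[r])
        on = np.argsort(spec_scores)[-top_specs:]
        bits[r * n_specs + on] = 1
    return bits

# Toy setup: 2 UBM regions in a 2-D feature space, 3 specificities per region.
ubm_means = np.array([[0.0, 0.0], [10.0, 10.0]])
ubm_vars = np.ones((2, 2))
spec_means = np.array([
    [[0.0, 0.0], [1.0, 1.0], [-5.0, -5.0]],
    [[10.0, 10.0], [11.0, 11.0], [20.0, 20.0]],
])
spec_vars = np.ones((2, 3, 2))

frame = np.array([0.1, 0.1])
bits = binarize_frame(frame, ubm_means, ubm_vars, spec_means, spec_vars)
print(bits)  # [1 1 0 0 0 0]
```

In this toy case the frame falls in the first region, so only bits belonging to that region's specificities can be set. A sequence of frames would give a sequence of such vectors, which is what preserves the temporal information that a single supervector discards.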
Document type: Conference paper

https://hal.archives-ouvertes.fr/hal-01317599
Contributor : Bibliothèque Universitaire Déposants Hal-Avignon
Submitted on : Wednesday, May 18, 2016 - 3:52:16 PM
Last modification on : Tuesday, July 2, 2019 - 5:38:02 PM

Citation

Jean-François Bonastre, Pierre-Michel Bousquet, Driss Matrouf. Discriminant binary data representation for speaker recognition. International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2011, Prague, Czech Republic. ⟨10.1109/ICASSP.2011.5947550⟩. ⟨hal-01317599⟩
