Typicality extraction in a Speaker Binary Keys model

Abstract: In the field of speaker recognition, the recently proposed notion of "Speaker Binary Key" provides a representation of each acoustic frame in a discriminant binary space. This approach relies on a unique acoustic model composed of a large set of speaker-specific local likelihood peaks (called specificities). The model provides a spatial coverage in which each frame is characterized in terms of its neighborhood. The most frequent specificities, selected to represent the whole utterance, generate a binary key vector. The flexibility of this modeling makes it possible to capture non-parametric behaviors. In this paper, we introduce a concept of "typicality" between binary keys, with a discriminant goal. We describe an algorithm able to extract such typicalities, which involves a singular value decomposition in a binary space. The theoretical aspects of this decomposition, as well as its potential for future developments, are presented. All the proposals are also validated experimentally using the NIST SRE 2008 framework.
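The key-generation step summarized in the abstract (frames characterized by their neighboring specificities, the most frequent ones retained for the utterance) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `top_k` and `key_ones` parameters, the use of a plain per-frame score list, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter

def frame_activations(frame_scores, top_k):
    """Return the indices of the top_k highest-scoring specificities for one frame.

    Assumption: each frame comes with one score per specificity (for example a
    local likelihood); the abstract does not specify the scoring.
    """
    ranked = sorted(range(len(frame_scores)),
                    key=lambda i: frame_scores[i], reverse=True)
    return ranked[:top_k]

def binary_key(utterance_scores, top_k, key_ones):
    """Build a binary key for an utterance (hypothetical helper).

    Counts how often each specificity is activated across frames, then sets
    to 1 the key_ones most frequently activated specificities.
    """
    n_specificities = len(utterance_scores[0])
    counts = Counter()
    for frame_scores in utterance_scores:
        counts.update(frame_activations(frame_scores, top_k))
    # Ties are broken by the lower index (an arbitrary, illustrative choice).
    most_frequent = sorted(range(n_specificities),
                           key=lambda i: (-counts[i], i))[:key_ones]
    key = [0] * n_specificities
    for i in most_frequent:
        key[i] = 1
    return key

# Toy data: 3 frames scored against 4 specificities.
example_scores = [
    [0.9, 0.1, 0.8, 0.2],
    [0.7, 0.6, 0.1, 0.2],
    [0.3, 0.2, 0.9, 0.8],
]
example_key = binary_key(example_scores, top_k=2, key_ones=2)
print(example_key)  # [1, 0, 1, 0]
```

In this toy case, specificities 0 and 2 are each activated in two of the three frames, so they are the two bits set in the key.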
Pierre-Michel Bousquet, Jean-François Bonastre. Typicality extraction in a Speaker Binary Keys model. ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, Mar 2012, Kyoto, Japan. pp.1713-1716, ⟨10.1109/ICASSP.2012.6288228⟩. ⟨hal-02159792⟩



