Novel Clustering Selection Criterion for Fast Binary Key Speaker Diarization

Abstract : Speaker diarization has become an important building block in many speech-related systems. Given the rapid growth of audiovisual media, fast systems are required to process large amounts of data in a reasonable time. In this regard, the recently proposed speaker diarization system based on binary key speaker modeling provides a very fast alternative to state-of-the-art systems at the cost of a slight decrease in performance. This decrease is mainly due to drawbacks in the final clustering selection algorithm, which is far from returning the optimum clustering the system is actually able to generate. At the same time, we have identified parts of our system that can be sped up further. This paper addresses these two issues, first by lightening the processing at the main identified bottleneck, and second by proposing an alternative clustering selection technique capable of providing near-optimum clustering outputs. Experimental results on the REPERE test database validate the effectiveness of the proposed improvements, obtaining a relative performance gain of 20% and execution times of 0.037 xRT (where xRT is the real-time factor).
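To make the two reported figures concrete, the following minimal Python sketch shows how a real-time factor and a relative gain are conventionally computed. The function names are hypothetical, and the error values passed to `relative_gain` are illustrative placeholders, not results from the paper. The abstract does not state which metric the 20% gain refers to; the sketch assumes an error-type metric where lower is better.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return xRT: processing time divided by the duration of the audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def relative_gain(baseline_error: float, new_error: float) -> float:
    """Return the relative reduction of an error metric, as a fraction.

    Assumes lower is better (an assumption; the abstract does not name the metric).
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# At the reported 0.037 xRT, one hour of audio takes about 133 seconds to process.
one_hour = 3600.0
processing_time = 0.037 * one_hour

# Illustrative placeholder values only: an error dropping from 20.0 to 16.0
# corresponds to a relative gain of 20%.
example_gain = relative_gain(20.0, 16.0)
```

An xRT value below 1 means the system runs faster than real time, so smaller values indicate faster processing.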
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-02102802
Contributor : Corinne Fredouille
Submitted on : Friday, April 19, 2019 - 11:59:22 AM
Last modification on : Saturday, April 20, 2019 - 1:38:20 AM

File

IS2015_Delgado.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-02102802, version 1

Citation

Héctor Delgado, Xavier Anguera, Corinne Fredouille, Javier Serrano. Novel Clustering Selection Criterion for Fast Binary Key Speaker Diarization. Interspeech 2015, Sep 2015, Dresden, Germany. ⟨hal-02102802⟩
