Combining speaker identification and BIC for speaker diarization

Abstract : This paper describes recent advances in speaker diarization obtained by incorporating a speaker identification step. The system builds upon the LIMSI baseline data partitioner used in the broadcast news transcription system. This partitioner provides high cluster purity but tends to split a speaker's data into several clusters when a large quantity of data is available for that speaker. Several improvements to the baseline system have been made. First, a standard Bayesian information criterion (BIC) agglomerative clustering has been integrated, replacing the iterative Gaussian mixture model (GMM) clustering. Then a second clustering stage has been added, using a speaker identification method with MAP-adapted GMMs. A final post-processing stage refines the segment boundaries using the output of the transcription system. On the RT-04f and ESTER evaluation data, the improved multi-stage system provides between 40% and 50% reduction of the speaker error, relative to a standard BIC clustering system.
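
The BIC agglomerative clustering stage mentioned in the abstract can be illustrated with a minimal sketch. It assumes the textbook delta-BIC merge criterion, with one full-covariance Gaussian per cluster and a penalty weight `lam`; the abstract does not give the paper's exact variant, its features, or its threshold, so the function names (`delta_bic`, `bic_cluster`) and the default `lam=1.0` are hypothetical choices made here for illustration only.

```python
import numpy as np


def log_det_cov(x):
    """Log-determinant of the maximum-likelihood covariance of x, shape (n, d)."""
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(x_i, x_j, lam=1.0):
    """Textbook delta-BIC between two clusters of feature vectors.

    A negative value means one Gaussian explains both clusters well enough
    that merging them is preferred (assumed convention, not from the paper).
    """
    n_i, d = x_i.shape
    n_j = x_j.shape[0]
    n = n_i + n_j
    merged = np.vstack([x_i, x_j])
    # Penalty on the extra parameters of the two-Gaussian model:
    # d means plus d(d+1)/2 covariance terms.
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    return (0.5 * n * log_det_cov(merged)
            - 0.5 * n_i * log_det_cov(x_i)
            - 0.5 * n_j * log_det_cov(x_j)
            - penalty)


def bic_cluster(segments, lam=1.0):
    """Greedy agglomerative clustering: merge the pair with the lowest
    delta-BIC until no pair has a negative delta-BIC."""
    clusters = [np.asarray(s, dtype=float) for s in segments]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                score = delta_bic(clusters[a], clusters[b], lam)
                if best is None or score < best[0]:
                    best = (score, a, b)
        if best[0] >= 0:
            break  # no remaining pair is better explained by a single Gaussian
        _, a, b = best
        merged = np.vstack([clusters[a], clusters[b]])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters
```

Each segment is passed as an array of shape (frames, features), for example cepstral vectors; larger values of `lam` make merging more likely and so yield fewer clusters. The second stage described in the abstract (speaker identification with MAP-adapted GMMs) is not covered by this sketch.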
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01434281
Contributor : Sylvain Meignier
Submitted on : Wednesday, March 22, 2017 - 3:02:12 PM
Last modification on : Tuesday, September 17, 2019 - 1:13:01 AM
Long-term archiving on : Friday, June 23, 2017 - 1:20:19 PM

File

IS051821v2.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01434281, version 1

Citation

Xuan Zhu, Claude Barras, Sylvain Meignier, Jean-Luc Gauvain. Combining speaker identification and BIC for speaker diarization. Interspeech'05, ISCA, 2005, Lisbon, Portugal. pp.4. ⟨hal-01434281⟩