Journal articles

An Adaptive Method for Cross-Recording Speaker Diarization

Abstract: State-of-the-art speaker diarization and linking systems rely heavily on between-recording variability compensation methods to process large collections of recordings accurately. Variability is estimated on sizeable training datasets, which must be labeled by speaker. One major problem with such systems is the acoustic mismatch between training and target data, which degrades performance. Most collections contain many speakers speaking in various acoustic conditions. In this paper, we investigate how unlabeled speakers can help improve between-recording variability estimation and thereby overcome the mismatch issue. We propose a scalable unsupervised adaptation framework for two types of variability compensation. The proposed framework consists of adapting a state-of-the-art diarization and linking system, trained on out-of-domain data, using the data of the collection itself. Experiments in mismatched conditions are run on two French television shows, while the initial training dataset is composed of radio recordings. Results indicate that the proposed adaptation framework reduces the cross-recording diarization error rate (DER) by 13% on average across variable collection sizes.

https://hal.archives-ouvertes.fr/hal-01818495
Contributor : Anthony Larcher
Submitted on : Wednesday, November 28, 2018 - 8:27:20 PM
Last modification on : Friday, April 26, 2019 - 1:35:55 PM

Citation

Gaël Le Lan, Delphine Charlet, Anthony Larcher, Sylvain Meignier. An Adaptive Method for Cross-Recording Speaker Diarization. IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2018, 14, pp.1-12. ⟨10.1109/TASLP.2018.2844025⟩. ⟨hal-01818495⟩
