
An Adaptive Method for Cross-Recording Speaker Diarization

Abstract : State-of-the-art speaker diarization and linking systems rely heavily on between-recording variability compensation methods to accurately process large collections of recordings. Variability is estimated on sizeable training datasets, which must be labeled by speaker. One major problem of such systems is the acoustic mismatch between training and target data, which degrades performance. Most collections contain many speakers speaking in various acoustic conditions. In this paper, we investigate how unlabeled speakers can help improve between-recording variability estimation and thereby overcome the mismatch issue. We propose a scalable unsupervised adaptation framework for two types of variability compensation. The proposed framework consists of adapting a state-of-the-art diarization and linking system, trained on out-of-domain data, using the data of the collection itself. Experiments under mismatched conditions are run on two French television shows, while the initial training dataset is composed of radio recordings. Results indicate that the proposed adaptation framework reduces the cross-recording diarization error rate (DER) by 13% on average across collection sizes.
Document type :
Journal articles
Contributor : Anthony Larcher
Submitted on : Wednesday, November 28, 2018 - 8:27:20 PM
Last modification on : Tuesday, May 17, 2022 - 11:18:02 AM


Files produced by the author(s)




Gaël Le Lan, Delphine Charlet, Anthony Larcher, Sylvain Meignier. An Adaptive Method for Cross-Recording Speaker Diarization. IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2018, 14, pp.1-12. ⟨10.1109/TASLP.2018.2844025⟩. ⟨hal-01818495⟩


