Iterative PLDA Adaptation for Speaker Diarization

Abstract: This paper investigates iterative PLDA adaptation for cross-show speaker diarization, within an i-vector framework, applied to small collections of French TV archives. Using the target collection itself for unsupervised adaptation, PLDA parameters are iteratively tuned, with score normalization applied for convergence. Performance is compared across combinations of target and external data for training and adaptation. Experiments on two distinct target corpora show that the proposed framework can gradually improve an existing system trained on external annotated data. These results indicate that speaker diarization on small collections of unlabeled audio archives needs only a sufficient bootstrap system, which can then be incrementally adapted to each target collection. The proposed framework also widens the range of acceptable speaker clustering thresholds for a given performance objective.
Index Terms: speaker diarization, PLDA, unsupervised training, domain adaptation, iterative training
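The iterative loop described in the abstract can be sketched roughly as follows. This is an illustrative sketch under strong assumptions, not the authors' implementation: PLDA is replaced by a much simpler stand-in (whitening by the pooled within-cluster covariance, followed by cosine scoring), random vectors stand in for i-vectors, and every function name, the threshold, and the number of iterations are hypothetical.

```python
import numpy as np

def within_class_whitener(x, labels):
    """Stand-in for PLDA re-estimation: inverse square root of the
    regularized pooled within-cluster covariance."""
    dim = x.shape[1]
    sw = np.zeros((dim, dim))
    for lab in np.unique(labels):
        xc = x[labels == lab]
        xc = xc - xc.mean(axis=0)
        sw += xc.T @ xc
    sw = sw / len(x) + 1e-3 * np.eye(dim)  # regularization keeps sw invertible
    vals, vecs = np.linalg.eigh(sw)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def score_matrix(x, whitener):
    """Cosine scores after whitening, then a simple symmetric
    per-row normalization (assumed form of score normalization)."""
    y = x @ whitener
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    s = y @ y.T
    n = len(s)
    off = s[~np.eye(n, dtype=bool)].reshape(n, n - 1)  # drop self-scores
    z = (s - off.mean(axis=1)[:, None]) / (off.std(axis=1)[:, None] + 1e-8)
    return 0.5 * (z + z.T)

def cluster(scores, threshold):
    """Average-linkage agglomerative clustering, stopped at `threshold`."""
    clusters = [[i] for i in range(len(scores))]
    while len(clusters) > 1:
        best, pair = -np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                s = scores[np.ix_(clusters[a], clusters[b])].mean()
                if s > best:
                    best, pair = s, (a, b)
        if best < threshold:
            break
        a, b = pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = np.empty(len(scores), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels

def iterative_adaptation(x, bootstrap_whitener, threshold=0.5, n_iter=3):
    """Cluster with the current model, re-estimate the model from the
    resulting pseudo-labels, and repeat."""
    whitener = bootstrap_whitener
    for _ in range(n_iter):
        labels = cluster(score_matrix(x, whitener), threshold)
        whitener = within_class_whitener(x, labels)
    labels = cluster(score_matrix(x, whitener), threshold)
    return labels, whitener

# Synthetic "collection": 3 speakers, 10 segments each, 8-dimensional vectors.
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 8)) * 3.0
x = np.vstack([c + rng.normal(size=(10, 8)) for c in centers])
labels, adapted = iterative_adaptation(x, np.eye(8), threshold=0.5, n_iter=3)
```

Here the identity matrix plays the role of the bootstrap system trained on external data. In a real setting it would be a model estimated on annotated external recordings, and the clustering threshold would be tuned on held-out data.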
Document type:
Conference paper
Interspeech 2016, Sep 2016, San Francisco, United States. 〈10.21437/Interspeech.2016-572〉
Contributor: Anthony Larcher
Submitted on: Monday, November 19, 2018 - 10:13:28
Last modified on: Wednesday, November 21, 2018 - 16:20:12
Document(s) archived on: Wednesday, February 20, 2019 - 13:03:52


Gaël Le Lan, Delphine Charlet, Anthony Larcher, Sylvain Meignier. Iterative PLDA Adaptation for Speaker Diarization. Interspeech 2016, Sep 2016, San Francisco, United States. 〈10.21437/Interspeech.2016-572〉. 〈hal-01818406〉


