A Triplet Ranking-Based Neural Network for Speaker Diarization and Linking

Abstract: This paper investigates a novel neural scoring method, based on conventional i-vectors, for speaker diarization and linking across large collections of recordings. Trained with a triplet loss, the network projects i-vectors into a space that better separates speakers in terms of cosine similarity. Experiments are run on two French TV collections built from the REPERE [1] and ETAPE [2] campaign corpora, with the system trained on French radio data. Results indicate that the proposed approach outperforms conventional cosine and Probabilistic Linear Discriminant Analysis (PLDA) scoring methods on both within- and cross-recording diarization tasks, with a Diarization Error Rate reduction of 14% on average.
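To make the training objective mentioned in the abstract concrete, here is a minimal sketch of a triplet loss computed on cosine similarity. It is an illustration under stated assumptions, not the authors' implementation: the hinge form, the `margin` value of 0.2, the function names, and the toy 3-dimensional vectors are all hypothetical, and real i-vectors have far more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on cosine similarity.

    The loss is zero once the anchor is more similar to the same-speaker
    vector than to the other-speaker vector by at least `margin`.
    The hinge form and the margin value are assumptions for illustration.
    """
    sim_pos = cosine_similarity(anchor, positive)
    sim_neg = cosine_similarity(anchor, negative)
    return max(0.0, margin - (sim_pos - sim_neg))


# Toy "embeddings" (hypothetical values, not real i-vectors).
anchor = [1.0, 0.0, 0.0]
same_speaker = [0.9, 0.1, 0.0]
other_speaker = [0.0, 1.0, 0.0]

# Correct ordering: the same-speaker vector is much closer, so no penalty.
well_separated = triplet_loss(anchor, same_speaker, other_speaker)

# Swapped roles: the "positive" is farther than the "negative", so the loss is large.
violating = triplet_loss(anchor, other_speaker, same_speaker)
```

In a training loop, a loss of this kind would be minimized over many (anchor, same speaker, other speaker) triplets, so that the projected vectors of one speaker end up closer to each other than to those of any other speaker.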
Document type:
Conference paper
Interspeech 2017, Aug 2017, Stockholm, Sweden. ISCA, Annual Conference of the International Speech Communication Association (Interspeech), 〈10.21437/Interspeech.2017-270〉
https://hal.archives-ouvertes.fr/hal-01818401
Contributor: Anthony Larcher
Submitted on: Monday, 19 November 2018 - 09:26:45
Last modified on: Wednesday, 21 November 2018 - 16:02:29

File

270_Paper.pdf
Files produced by the author(s)

Citation

Gaël Le Lan, Delphine Charlet, Anthony Larcher, Sylvain Meignier. A Triplet Ranking-Based Neural Network for Speaker Diarization and Linking. Interspeech 2017, Aug 2017, Stockholm, Sweden. ISCA, Annual Conference of the International Speech Communication Association (Interspeech), 〈10.21437/Interspeech.2017-270〉. 〈hal-01818401〉
