Conference paper, Year: 2017

A Triplet Ranking-Based Neural Network for Speaker Diarization and Linking

Abstract

This paper investigates a novel neural scoring method, based on conventional i-vectors, for speaker diarization and linking across large collections of recordings. Trained with a triplet loss, the network projects i-vectors into a space that better separates speakers in terms of cosine similarity. Experiments are run on two French TV collections built from the REPERE [1] and ETAPE [2] campaign corpora, with the system trained on French radio data. Results indicate that the proposed approach outperforms conventional cosine and Probabilistic Linear Discriminant Analysis (PLDA) scoring on both within- and cross-recording diarization tasks, with an average Diarization Error Rate reduction of 14%.
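To make the triplet ranking idea concrete, the following is a minimal sketch of a hinge-style triplet loss computed on cosine similarity. It is an illustration only: the abstract does not specify the network architecture, the margin, or the exact loss formulation, so the single tanh layer in `project`, the margin value of 0.2, and all function names here are assumptions rather than the authors' implementation.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def project(x, weights, bias):
    """Hypothetical projection: one tanh layer standing in for the network.

    `weights` is a list of rows, one row per output dimension.
    """
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet ranking loss on cosine similarity.

    The loss is zero once the anchor is more similar to the positive
    (same speaker) than to the negative (different speaker) by at least
    `margin`. The margin value is an assumption for illustration.
    """
    s_pos = cosine_similarity(anchor, positive)
    s_neg = cosine_similarity(anchor, negative)
    return max(0.0, margin - (s_pos - s_neg))


if __name__ == "__main__":
    # Toy data: random 4-dimensional stand-ins for i-vectors,
    # projected to 3 dimensions with random (untrained) weights.
    rng = random.Random(0)
    weights = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(3)]
    bias = [0.0, 0.0, 0.0]
    anchor, positive, negative = (
        project([rng.gauss(0, 1) for _ in range(4)], weights, bias)
        for _ in range(3)
    )
    print("triplet loss:", triplet_loss(anchor, positive, negative))
```

In training, the gradient of such a loss with respect to the projection parameters would pull same-speaker i-vectors together and push different-speaker i-vectors apart in cosine terms, which is the separation property the abstract describes.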
Main file
270_Paper.pdf (2.87 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01818401, version 1 (19-11-2018)

Cite

Gaël Le Lan, Delphine Charlet, Anthony Larcher, Sylvain Meignier. A Triplet Ranking-Based Neural Network for Speaker Diarization and Linking. Interspeech 2017, Aug 2017, Stockholm, Sweden. ⟨10.21437/Interspeech.2017-270⟩. ⟨hal-01818401⟩