A Triplet Ranking-Based Neural Network for Speaker Diarization and Linking

Gaël Le Lan; Delphine Charlet; Anthony Larcher; Sylvain Meignier

doi:10.21437/Interspeech.2017-270

Conference paper, Year: 2017


Gaël Le Lan

Role: Author
PersonId : 751878
IdHAL : gael-le-lan
ORCID : 0000-0002-1493-5777

Orange Labs [Lannion]

Delphine Charlet

Role: Author

Orange Labs [Lannion]

Anthony Larcher

Role: Author
PersonId : 20105
IdHAL : anthony-larcher
ORCID : 0000-0003-4398-0224
IdRef : 139544569

Laboratoire d'Informatique de l'Université du Mans

Sylvain Meignier

Role: Author
PersonId : 11674
IdHAL : sylvain-meignier
ORCID : 0000-0001-7687-073X
IdRef : 182269086

Laboratoire d'Informatique de l'Université du Mans

Abstract

This paper investigates a novel neural scoring method, based on conventional i-vectors, for speaker diarization and linking of large collections of recordings. Trained with a triplet loss, the network projects i-vectors into a space that better separates speakers in terms of cosine similarity. Experiments are run on two French TV collections built from the REPERE [1] and ETAPE [2] campaign corpora, with the system trained on French radio data. Results indicate that the proposed approach outperforms conventional cosine and Probabilistic Linear Discriminant Analysis (PLDA) scoring on both within- and cross-recording diarization tasks, reducing the Diarization Error Rate by 14% on average.

Keywords

speaker diarization; neural network; triplet loss

Domains

Artificial intelligence [cs.AI]

Main file

270_Paper.pdf (2.87 MB)

Origin: Files produced by the author(s)

Contributor: Anthony Larcher

https://hal.science/hal-01818401

Submitted on: Monday, 19 November 2018, 09:26:45

Last modified on: Tuesday, 17 May 2022, 11:18:02

Long-term archiving on: Wednesday, 20 February 2019, 13:06:29

Dates and versions

hal-01818401, version 1 (19-11-2018)

Identifiers

HAL Id: hal-01818401, version 1
DOI: 10.21437/Interspeech.2017-270

Cite

Gaël Le Lan, Delphine Charlet, Anthony Larcher, Sylvain Meignier. A Triplet Ranking-Based Neural Network for Speaker Diarization and Linking. Interspeech 2017, Aug 2017, Stockholm, Sweden. ⟨10.21437/Interspeech.2017-270⟩. ⟨hal-01818401⟩


Collections

UNIV-LEMANS LIUM LIUM-LST

