I-vectors and ILP clustering adapted to cross-show speaker diarization

Abstract: We study speaker diarization over a collection of audio documents, where the goal is to detect speakers who appear in several shows. In our approach, each show in the collection is first processed separately, and the collection is then processed as a whole to group speakers involved in several shows. Two clustering methods are studied for this collection-level processing: one uses the NCLR metric, and the other is inspired by i-vector techniques, which are mainly used in speaker verification. Both methods were evaluated on the whole training corpus of ESTER 2. The i-vector method achieves error rates similar to those of the NCLR method while running on average 8.66 times faster, which makes it suitable for processing large volumes of data.
Document type:
Conference paper
Interspeech, 2012, Portland, Oregon (USA), United States. Interspeech 2012, 2012

Cited literature: 9 references

https://hal.archives-ouvertes.fr/hal-01450711
Contributor: Hakim Amokrane
Submitted on: Monday, April 3, 2017 - 21:50:48
Last modified on: Thursday, April 6, 2017 - 10:00:10
Document(s) archived on: Tuesday, July 4, 2017 - 14:52:13

File

i12_2174.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-01450711, version 1

Citation

Grégor Dupuy, Mickael Rouvier, Sylvain Meignier, Yannick Estève. I-vectors and ILP clustering adapted to cross-show speaker diarization. Interspeech, 2012, Portland, Oregon (USA), United States. Interspeech 2012, 2012. 〈hal-01450711〉
