Speaker diarization: about whom the speaker is talking?

Abstract: Automatic speaker diarization consists of splitting the signal into homogeneous segments and clustering them by speaker. However, the resulting speaker segments carry only anonymous labels. This paper proposes a solution to identify those speakers by extracting their full names as pronounced in the show. Using a semantic classification tree automatically built on a training corpus, each full name detected in the transcription of a segment is associated with that segment or with one of its neighbors. A merging method then assigns a full name to each speaker cluster in place of the anonymous label provided by the diarization. The experiments are carried out on French broadcast news recordings from the ESTER 2005 evaluation campaign. About 70% of the show duration is correctly processed for both the development and evaluation corpora. On the evaluation corpus, 18.15% of the show duration is wrongly named and no decision is taken for 11.91% of the show duration.
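The merging step and the duration-based scores mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the merging is done, so the duration-weighted vote below is an assumption for illustration only, not the authors' method. The function names, the data layout, and the speaker names are all hypothetical.

```python
from collections import defaultdict


def name_clusters(segments):
    """Assign one full name per cluster by duration-weighted vote.

    segments: list of (cluster_label, duration_seconds, name_or_None),
    where name is the full name attached to that segment, if any.
    The voting rule is an assumption, not the paper's merging method.
    """
    votes = defaultdict(lambda: defaultdict(float))
    for cluster, duration, name in segments:
        if name is not None:
            votes[cluster][name] += duration
    names = {}
    for cluster, _, _ in segments:
        candidates = votes.get(cluster)
        # No name hypothesis for this cluster: take no decision.
        names[cluster] = max(candidates, key=candidates.get) if candidates else None
    return names


def duration_breakdown(segments, names, reference):
    """Return (correct, wrong, undecided) as percentages of total duration.

    reference: true speaker name for each segment, in the same order.
    """
    total = sum(duration for _, duration, _ in segments)
    correct = wrong = undecided = 0.0
    for (cluster, duration, _), truth in zip(segments, reference):
        assigned = names[cluster]
        if assigned is None:
            undecided += duration
        elif assigned == truth:
            correct += duration
        else:
            wrong += duration
    return tuple(100.0 * part / total for part in (correct, wrong, undecided))


# Invented toy data: three anonymous clusters, 100 seconds in total.
segments = [
    ("S0", 30.0, "Alice Martin"),
    ("S0", 10.0, None),
    ("S0", 5.0, "Bob Durand"),
    ("S1", 20.0, None),
    ("S2", 35.0, "Bob Durand"),
]
reference = ["Alice Martin", "Alice Martin", "Alice Martin", "Carl Petit", "Denis Roy"]

names = name_clusters(segments)
correct, wrong, undecided = duration_breakdown(segments, names, reference)
```

The three percentages always sum to 100, which matches the figures reported for the evaluation corpus: 100 - 18.15 - 11.91 = 69.94, i.e. the "about 70%" of show duration that is correctly processed.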
Document type:
Conference paper
IEEE Speaker Odyssey 2006, 2006, San Juan Puerto Rico. 2006 IEEE Odyssey - The Speaker and Language Recognition Workshop 2006


https://hal.archives-ouvertes.fr/hal-01434121
Contributor: Sylvain Meignier
Submitted on: Thursday, February 9, 2017 - 14:14:03
Last modified on: Thursday, December 21, 2017 - 00:56:46
Document(s) archived on: Wednesday, May 10, 2017 - 13:52:53

File

odyssey.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01434121, version 1

Citation

Julie Mauclair, Sylvain Meignier, Yannick Estève. Speaker diarization: about whom the speaker is talking? IEEE Speaker Odyssey 2006, 2006, San Juan, Puerto Rico. 2006 IEEE Odyssey - The Speaker and Language Recognition Workshop 2006. 〈hal-01434121〉
