Speaker diarization: about whom the speaker is talking?

Julie Mauclair; Sylvain Meignier; Yannick Estève

Conference paper, Year: 2006

Julie Mauclair

Role: Author
PersonId : 737843
IdHAL : julie-mauclair
ORCID : 0000-0002-2740-5118
IdRef : 139045953

Laboratoire d'Informatique de l'Université du Mans

Sylvain Meignier

Role: Author
PersonId : 11674
IdHAL : sylvain-meignier
ORCID : 0000-0001-7687-073X
IdRef : 182269086

Laboratoire d'Informatique de l'Université du Mans

Yannick Estève

Role: Author
PersonId : 11645
IdHAL : yannick-esteve
ORCID : 0000-0002-3656-8883
IdRef : 070531668

Laboratoire d'Informatique de l'Université du Mans

Abstract

Automatic speaker diarization consists of splitting the signal into homogeneous segments and clustering them by speaker. However, the resulting speaker segments carry only anonymous labels. This paper proposes a solution to identify those speakers by extracting their full names as pronounced in the show. Using a semantic classification tree built automatically on a training corpus, each full name detected in the transcription of a segment is associated with that segment or with one of its neighbors. A merging method then associates a full name with each speaker cluster, replacing the anonymous label provided by the diarization. The experiments are carried out on French broadcast news recordings from the ESTER 2005 evaluation campaign. About 70% of show duration is correctly processed for both the development and evaluation corpora. On the evaluation corpus, 18.15% of show duration is wrongly named and no decision is taken for 11.91% of show duration.
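The abstract does not specify how the merging step decides on a name for each cluster. As a rough illustration only, the sketch below assumes a duration-weighted majority vote over the names attributed to a cluster's segments; this voting rule, the function name name_clusters, the tuple layout, and the speaker names in the example are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def name_clusters(segments):
    """Assign one full name (or None) to each anonymous speaker cluster.

    segments: list of (cluster_label, attributed_name_or_None, duration_seconds).
    ASSUMPTION: a duration-weighted majority vote; the paper's actual
    merging method may differ.
    """
    votes = defaultdict(Counter)
    for cluster, name, duration in segments:
        if name is not None:
            # Each named segment votes for its name, weighted by its length.
            votes[cluster][name] += duration

    naming = {}
    for cluster, _, _ in segments:
        if votes[cluster]:
            naming[cluster] = votes[cluster].most_common(1)[0][0]
        else:
            # No name was attributed to any segment: take no decision.
            naming[cluster] = None
    return naming


# Hypothetical diarization output: anonymous labels S0, S1, S2.
segments = [
    ("S0", "Jean Dupont", 12.0),
    ("S0", None, 30.0),
    ("S0", "Marie Martin", 4.0),
    ("S1", "Marie Martin", 20.0),
    ("S2", None, 8.0),
]
print(name_clusters(segments))
# {'S0': 'Jean Dupont', 'S1': 'Marie Martin', 'S2': None}
```

A cluster left with None corresponds to the "no decision" case reported in the abstract, while a cluster receiving the wrong name would count toward the wrongly named duration.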

Keywords

speaker diarization, speaker identification

Domains

Computation and Language [cs.CL]

Main file

odyssey.pdf (325.62 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-01434121

Submitted on: Thursday, February 9, 2017, 14:14:03

Last modified on: Wednesday, January 6, 2021, 10:30:02

Long-term archiving on: Wednesday, May 10, 2017, 13:52:53

Dates and versions

hal-01434121 , version 1 (09-02-2017)

Identifiers

HAL Id : hal-01434121 , version 1

Cite

Julie Mauclair, Sylvain Meignier, Yannick Estève. Speaker diarization: about whom the speaker is talking? IEEE Speaker Odyssey 2006, 2006, San Juan, Puerto Rico. ⟨hal-01434121⟩

Collections

UNIV-LEMANS LIUM LIUM-LST
