Step-by-step and integrated approaches in broadcast news speaker diarization

This paper summarizes the collaboration of the LIA and CLIPS laboratories on speaker diarization of broadcast news during the spring NIST Rich Transcription 2003 evaluation campaign (NIST-RT'03S). The speaker diarization task consists of segmenting a conversation into homogeneous segments, which are then grouped into speaker classes. Two speaker diarization approaches are described and compared. The first relies on a classical two-step strategy, in which speaker turns are detected and the resulting segments are then clustered. The second uses an integrated strategy, in which both the segment boundaries and the assignment of segments to speakers are extracted simultaneously and challenged throughout the process. These two methods are used to investigate various strategies for the fusion of diarization results. Furthermore, segmentation into acoustic macro-classes is proposed and evaluated as an a priori step to speaker diarization. The objective is to take advantage of the a priori acoustic information in the diarization process. Along with enriching the resulting segmentation with information about speaker gender,

Domains

Computation and Language [cs.CL]

Main file

lia-clips.pdf (368.3 KB)

Origin: Files produced by the author(s)

Contributor: bibliothèque Universitaire Déposants HAL-Avignon

https://hal.science/hal-01318554

Submitted on: Friday, March 24, 2017, 23:29:50

Last modified on: Thursday, April 4, 2024, 21:23:25

Long-term archiving on: Sunday, June 25, 2017, 12:32:41

Dates and versions

hal-01318554, version 1 (24-03-2017)

Identifiers

HAL Id: hal-01318554, version 1
DOI: 10.1016/j.csl.2005.08.002

Cite

Sylvain Meignier, Daniel Moraru, Corinne Fredouille, Jean-François Bonastre, Laurent Besacier. Step-by-step and integrated approaches in broadcast news speaker diarization. Computer Speech and Language, 2006, Odyssey 2004: The Speaker and Language Recognition Workshop, 20 (2-3), pp.303-330. ⟨10.1016/j.csl.2005.08.002⟩. ⟨hal-01318554⟩

Collections

UNIV-AVIGNON UGA IMAG CNRS UNIV-LEMANS LIUM LIA POLYTECH-GRENOBLE

371 views

460 downloads