Comparing Multi-Stage Approaches for Cross-Show Speaker Diarization

Viet-Anh Tran; Viet Bac Le; Claude Barras; Lori Lamel

Conference paper, Year: 2011


Viet-Anh Tran

Role: Author
PersonId: 907457

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Viet Bac Le

Role: Author

Vocapia Research [Orsay]

Claude Barras

Role: Author
PersonId: 17217
IdHAL: claude-barras
IdRef: 165065583

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Lori Lamel

Role: Author
PersonId: 15965
IdHAL: lori-lamel
ORCID: 0000-0001-7443-9938
IdRef: 127578056

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Abstract

Acoustic speaker diarization is investigated for situations where a collection of shows from the same source needs to be processed. In this case, the same speaker should receive the same label across all shows. We compare three architectures for cross-show speaker diarization: the straightforward concatenation of all shows; a hybrid system in which a local clustering stage is followed by a global clustering stage; and an incremental system that processes the shows in a predefined order and updates the speaker models accordingly. The incremental system is the best suited to real application settings. The three strategies are compared with a baseline single-show system on a set of 46 ten-minute samples of British English scientific podcasts.
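The incremental strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the abstract does not specify the features, the speaker models, or the clustering criterion, so everything below is an assumption made for illustration. Each show is assumed to have already been diarized locally, each local cluster is assumed to be summarized by a single nonzero feature vector, speaker models are plain running means, and the cosine distance with a threshold of 0.3 is a hypothetical matching rule. The function name `incremental_cross_show` is likewise hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two nonzero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def incremental_cross_show(shows, threshold=0.3):
    """Assign global speaker labels to local clusters, show by show.

    shows: list of dicts, one per show in processing order, mapping a
           local cluster label to its feature vector (an assumption).
    Returns one {local label: global label} dict per show.
    """
    models = {}  # global speaker id -> (mean vector, clusters merged so far)
    labels = []
    for show in shows:
        mapping = {}
        for local_label, vec in show.items():
            # Look for the closest existing speaker model under the threshold.
            best_id, best_dist = None, threshold
            for spk_id, (mean, _) in models.items():
                dist = cosine_distance(vec, mean)
                if dist < best_dist:
                    best_id, best_dist = spk_id, dist
            if best_id is None:
                # No model is close enough: open a new global speaker.
                best_id = f"spk{len(models)}"
                models[best_id] = (list(vec), 1)
            else:
                # Update the matched model with a running mean.
                mean, n = models[best_id]
                updated = [(m * n + v) / (n + 1) for m, v in zip(mean, vec)]
                models[best_id] = (updated, n + 1)
            mapping[local_label] = best_id
        labels.append(mapping)
    return labels


# Two toy shows whose local labels "A" and "B" are swapped between shows.
shows = [
    {"A": [1.0, 0.0], "B": [0.0, 1.0]},
    {"A": [0.1, 0.9], "B": [0.9, 0.1]},
]
print(incremental_cross_show(shows))
# [{'A': 'spk0', 'B': 'spk1'}, {'A': 'spk1', 'B': 'spk0'}]
```

In this toy example the local labels of the second show are remapped so that each speaker keeps one global label across both shows, which is the requirement stated in the abstract. Because models are updated as each show arrives, the result depends on the processing order.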

Keywords

speaker segmentation and clustering; cross-show diarization; speaker diarization

Domains

Signal and image processing [eess.SP]; Computer science [cs]

Main file

i11_1053.pdf (277.03 KB)

Origin: Publisher files allowed on an open archive


https://hal.science/hal-01690265

Submitted on: Tuesday, January 23, 2018, 17:13:57

Last modified on: Saturday, October 7, 2023, 21:36:20

Long-term archiving on: Thursday, May 24, 2018, 09:36:21

Dates and versions

hal-01690265, version 1 (23-01-2018)

Identifiers

HAL Id: hal-01690265, version 1

Cite

Viet-Anh Tran, Viet Bac Le, Claude Barras, Lori Lamel. Comparing Multi-Stage Approaches for Cross-Show Speaker Diarization. Interspeech 2011, Aug 2011, Florence, Italy. ⟨hal-01690265⟩


Collections

CNRS LIMSI SORBONNE-UNIVERSITE LISN

