Extracting true speaker identities from transcriptions

Abstract: Automatic speaker diarization generally produces a generic label such as spkr1 rather than the true identity of the speaker. Recently, two approaches based on lexical rules were proposed to extract the true identity of the speaker from the transcription of the audio recording without any a priori acoustic information: one uses n-grams, the other uses semantic classification trees (SCT). The latter was proposed by the authors of this paper. Here, the two methods are compared in experiments carried out on French broadcast news recordings from the ESTER 2005 evaluation campaign. Experiments are run on both manual and automatic transcriptions. On manual transcriptions, the n-gram-based approach can be more precise, but on automatic transcriptions, the SCT-based approach gives significantly better results in terms of both recall and precision.
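To make the idea of lexical rules concrete, the following is a minimal sketch of how a full name found in a transcript could be attached to the previous, current, or next speaker turn. It is not the n-gram or SCT method evaluated in the paper: the trigger phrases, the `RULES` table, the `assign_names` function, and the English sample turns are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical name pattern: two capitalized words, e.g. "Anne Martin".
NAME = r"(?P<name>[A-Z][a-z]+ [A-Z][a-z]+)"

# Hypothetical toy rules (NOT the rules from the paper). Each rule pairs a
# trigger phrase with the offset of the turn the detected name refers to:
# 0 = current turn, -1 = previous turn, +1 = next turn.
RULES = [
    (re.compile(r"\b(?i:this is|i am|i'm)\s+" + NAME), 0),
    (re.compile(r"\b(?i:thank you|thanks),?\s+" + NAME), -1),
    (re.compile(r"\b(?i:over to|we now join)\s+" + NAME), +1),
]


def assign_names(turns):
    """Map diarization labels to names found in the transcript.

    `turns` is a list of (label, text) pairs in temporal order.
    The first name assigned to a label is kept; later ones are ignored.
    """
    names = {}
    for i, (_, text) in enumerate(turns):
        for pattern, offset in RULES:
            for match in pattern.finditer(text):
                target = i + offset
                if 0 <= target < len(turns):
                    target_label = turns[target][0]
                    names.setdefault(target_label, match.group("name"))
    return names


# Invented sample input with generic diarization labels.
turns = [
    ("spkr1", "Good evening, this is Anne Martin. Over to Paul Durand in Antwerp."),
    ("spkr2", "Thank you, Anne Martin. The vote took place this morning."),
]
result = assign_names(turns)
print(result)
```

Hand-written patterns like these depend on exact wording, so they would likely degrade when names or trigger words are misrecognized in automatic transcriptions; the sketch makes no claim about how either compared method handles such errors.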
Document type:
Conference paper
Interspeech 2007, 2007, Antwerp, Belgium.
Cited literature: 10 references

https://hal.archives-ouvertes.fr/hal-01434096
Contributor: Sylvain Meignier
Submitted on: Monday, April 3, 2017 - 22:27:04
Last modified on: Thursday, April 6, 2017 - 10:01:50
Document(s) archived on: Tuesday, July 4, 2017 - 14:53:08

File

i07_2601.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id: hal-01434096, version 1

Citation

Yannick Estève, Sylvain Meignier, Paul Deléglise, Julie Mauclair. Extracting true speaker identities from transcriptions. Interspeech 2007, 2007, Antwerp, Belgium. 〈hal-01434096〉
