Speaker Role Recognition on TV Broadcast Documents

Abstract : In this paper, we present the results obtained by a state-of-the-art system for Speaker Role Recognition (SRR) on the TV broadcast documents from the REPERE Multimedia Challenge. This SRR system is based on the assumption that cues about speaker roles can be extracted from a set of 36 low-level features derived from the outputs of a Speaker Diarization process. Starting from manually annotated speaker segments, we first evaluate the performance of the SRR system, previously evaluated on broadcast radio recordings, on this heterogeneous set of TV shows. We then propose a new classification strategy, observing how building show-dependent models improves SRR. The system is then applied to speaker segmentation outputs produced by an automatic system, enabling us to investigate the influence of the errors introduced by this front-end process on Role Recognition. In these different contexts, the system correctly classifies 86.9% of speaker roles when applied to manual speaker segmentations and 74.5% when applied to automatic Speaker Diarization outputs.

Index Terms : speaker role recognition, speech processing, content-based indexing of audiovisual documents
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01339093
Contributor : Bibliothèque Universitaire Déposants Hal-Avignon
Submitted on : Wednesday, June 29, 2016 - 3:24:30 PM
Last modification on : Friday, March 29, 2019 - 2:36:04 PM

Identifiers

  • HAL Id : hal-01339093, version 1

Citation

Benjamin Bigot, Corinne Fredouille, Delphine Charlet. Speaker Role Recognition on TV Broadcast Documents. First Workshop on Speech, Language and Audio in Multimedia (SLAM), Aug 2013, Marseille, France. ⟨hal-01339093⟩
