Person Instance Graphs for Named Speaker Identification in TV Broadcast

Abstract : We address the problem of named speaker identification in TV broadcast, which consists in answering the question "who speaks when?" with the real identity of speakers, using person names automatically obtained from speech transcripts. While existing approaches rely on a first speaker diarization step followed by a local step that propagates names to speaker clusters, we propose a unified framework, called person instance graph, in which both steps are jointly modeled as a global optimization problem that is then solved using integer linear programming. Moreover, when available, acoustic speaker models can be added seamlessly to the graph structure for joint named and acoustic speaker identification, leading to a 10-point absolute error decrease (from 45% down to 35%) over a state-of-the-art i-vector speaker identification system on the REPERE TV broadcast corpus.
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01690272
Contributor : Claude Barras
Submitted on : Tuesday, January 23, 2018 - 5:28:31 PM
Last modification on : Saturday, May 4, 2019 - 1:21:07 AM
Long-term archiving on : Thursday, May 24, 2018 - 9:13:27 AM

Identifiers

  • HAL Id : hal-01690272, version 1

Citation

Hervé Bredin, Antoine Laurent, Achintya Sarkar, Viet-Bac Le, Sophie Rosset, et al. Person Instance Graphs for Named Speaker Identification in TV Broadcast. Odyssey 2014, Jun 2014, Joensuu, Finland. ⟨hal-01690272⟩
