Person instance graphs for mono-, cross- and multi-modal person recognition in multimedia data: application to speaker identification in TV broadcast

Abstract : This work introduces a unified framework for mono-, cross- and multi-modal person recognition in multimedia data. Dubbed Person Instance Graph, it models the person recognition task as a graph mining problem: i.e. finding the best mapping between person instance vertices and identity vertices. Practically, we describe how the approach can be applied to speaker identification in TV broadcast. Then, a solution to the above-mentioned mapping problem is proposed. It relies on Integer Linear Programming to model the problem of clustering person instances based on their identity. We provide an in-depth theoretical definition of the optimization problem. Moreover, we improve two fundamental aspects of our previous related work: the problem constraints and the optimized objective function. Finally, a thorough experimental evaluation of the proposed framework is performed on a publicly available benchmark database. Depending on the graph configuration (i.e. the choice of its vertices and edges), we show that multiple tasks can be addressed interchangeably (e.g. speaker diarization, supervised or unsupervised speaker identification), significantly outperforming state-of-the-art mono-modal approaches.
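To make the mapping problem concrete, here is a minimal Python sketch of clustering person instance vertices around identity vertices. It is an illustration under stated assumptions, not the authors' formulation: the objective (sum of `p - 0.5` over pairs placed in the same cluster), the constraint (at most one identity vertex per cluster), the vertex names and the edge probabilities are all hypothetical. The paper solves the problem with Integer Linear Programming; this sketch swaps the solver for exhaustive enumeration, which is only feasible for a handful of vertices.

```python
from itertools import combinations


def partitions(items):
    """Yield every set partition of a list (tiny inputs only)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Either `first` forms its own cluster...
        yield [[first]] + part
        # ...or it joins one of the existing clusters.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def score(part, prob):
    """Assumed objective: reward likely same-person pairs, penalise unlikely ones."""
    total = 0.0
    for cluster in part:
        for a, b in combinations(cluster, 2):
            # A missing edge is treated as uninformative (p = 0.5).
            total += prob.get(frozenset((a, b)), 0.5) - 0.5
    return total


def valid(part, identities):
    """Assumed constraint: a cluster holds at most one identity vertex."""
    return all(sum(v in identities for v in c) <= 1 for c in part)


def identify(vertices, identities, prob):
    """Map each person instance to the identity in its cluster, or None."""
    feasible = (p for p in partitions(vertices) if valid(p, identities))
    best = max(feasible, key=lambda p: score(p, prob))
    mapping = {}
    for cluster in best:
        names = [v for v in cluster if v in identities]
        label = names[0] if names else None
        for v in cluster:
            if v not in identities:
                mapping[v] = label
    return mapping


# Hypothetical graph: three speech turns and two identity vertices.
identities = {"id:Alice", "id:Bob"}
vertices = ["turn1", "turn2", "turn3", "id:Alice", "id:Bob"]
prob = {
    frozenset(("turn1", "turn2")): 0.9,     # same voice, very likely
    frozenset(("turn1", "turn3")): 0.1,
    frozenset(("turn2", "turn3")): 0.2,
    frozenset(("turn1", "id:Alice")): 0.8,  # e.g. a supervised speaker model
    frozenset(("turn3", "id:Bob")): 0.7,
}
mapping = identify(vertices, identities, prob)
```

In this toy graph `turn2` has no direct edge to any identity vertex; it is labelled through its strong edge to `turn1`, which is the kind of propagation a graph formulation allows. Dropping the identity vertices from `vertices` leaves a pure clustering problem, loosely analogous to the speaker diarization configuration mentioned in the abstract.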
Document type :
Journal articles

Cited literature : 41 references

https://hal.archives-ouvertes.fr/hal-01690350
Contributor : Claude Barras
Submitted on : Monday, January 22, 2018 - 11:39:31 PM
Last modification on : Saturday, May 4, 2019 - 1:18:33 AM
Long-term archiving on : Thursday, May 24, 2018 - 11:18:11 AM

File

paper.pdf
Files produced by the author(s)

Citation

Hervé Bredin, Anindya Roy, Viet-Bac Le, Claude Barras. Person instance graphs for mono-, cross- and multi-modal person recognition in multimedia data: application to speaker identification in TV broadcast. International Journal of Multimedia Information Retrieval, Springer, 2014, 3 (3), pp.161 - 175. ⟨10.1007/s13735-014-0055-y⟩. ⟨hal-01690350⟩
