Tag Propagation Approaches within Speaking Face Graphs for Multimodal Person Discovery

Abstract: Indexing broadcast TV archives is an open problem in multimedia research. As these databases grow continuously, meaningful features, such as the identity of speaking faces, are needed to describe and connect their elements efficiently. In this context, this paper focuses on two approaches for unsupervised person discovery. Initial tags for speaking faces are provided by an OCR-based method, and these tags are then propagated through a graph model built on audiovisual relations between speaking faces. Two propagation methods are proposed, one based on random walks and the other on a hierarchical approach. To better evaluate their performance, both methods are compared with two graph clustering baselines. We also study the impact of different modality fusions on graph-based tag propagation. In a quantitative analysis, the graph propagation techniques always outperform the baselines. Among all compared strategies, hierarchical propagation with late fusion and random walk with score fusion obtain the highest mean average precision (MAP) values. Finally, although these two methods produce highly concordant results according to the Kappa coefficient, the random walk method performs better according to a paired t-test, while hierarchical propagation runs more than four times faster than random walk propagation.
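To make the idea of tag propagation over a speaking-face graph concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify the random walk variant, so this sketch assumes a random walk with restart, and the function name `propagate_tags`, the toy edge weights, the restart probability, and the tags "alice" and "bob" are all hypothetical. Nodes stand for speaking faces, edge weights stand for audiovisual similarity, and a few nodes carry initial tags such as those an OCR step might provide.

```python
def propagate_tags(weights, seeds, restart=0.15, iters=100):
    """Assign a tag to every node with a random walk with restart (illustrative only).

    weights: square list of lists, weights[i][j] = similarity between nodes i and j.
    seeds:   dict mapping node index -> initial tag (e.g. from an OCR step).
    """
    n = len(weights)
    # Row-normalise similarities into transition probabilities.
    trans = []
    for row in weights:
        total = sum(row)
        trans.append([w / total if total else 0.0 for w in row])

    scores = {}  # tag -> list of per-node visiting scores
    for tag in set(seeds.values()):
        sources = {i for i, t in seeds.items() if t == tag}
        start = [1.0 / len(sources) if i in sources else 0.0 for i in range(n)]
        p = start[:]
        for _ in range(iters):
            # One walk step, with probability `restart` of jumping back to a seed.
            p = [
                restart * start[j]
                + (1 - restart) * sum(p[i] * trans[i][j] for i in range(n))
                for j in range(n)
            ]
        scores[tag] = p

    result = {}
    for node in range(n):
        if node in seeds:
            result[node] = seeds[node]  # keep initial tags unchanged
        else:
            # Untagged node takes the tag whose walk visits it most often.
            result[node] = max(sorted(scores), key=lambda t: scores[t][node])
    return result


# Toy graph (hypothetical): nodes 0-2 and 3-4 form two tight groups, weakly linked via 2-3.
toy_weights = [
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.9, 0.0, 0.9, 0.0, 0.0],
    [0.8, 0.9, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.9, 0.0],
]
toy_seeds = {0: "alice", 4: "bob"}
tags = propagate_tags(toy_weights, toy_seeds)
print(tags)
```

In this toy graph the untagged nodes inherit the tag of the seed in their own tightly connected group, which is the behaviour a propagation scheme of this kind is meant to produce. How modalities are fused into edge weights, and how the hierarchical variant works, is described in the paper itself and is not reproduced here.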
Document type:
Conference paper
International workshop on Content-Based Multimedia Indexing (CBMI), Jun 2017, Firenze, Italy. 2017, Proceedings of the 15th international workshop on Content-Based Multimedia Indexing
Cited literature [17 references]

https://hal.archives-ouvertes.fr/hal-01551648
Contributor: Gabriel Sargent
Submitted on: Friday, June 30, 2017 - 14:04:54
Last modified on: Tuesday, November 21, 2017 - 15:23:53

File

2017-conf-cbmi-tag-propagation...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01551648, version 1

Citation

Gabriel Barbosa Da Fonseca, Gabriel Sargent, Izabela Lyon Freire, Ronan Sicre, Zenilton Patrocinio Jr, et al. Tag Propagation Approaches within Speaking Face Graphs for Multimodal Person Discovery. International workshop on Content-Based Multimedia Indexing (CBMI), Jun 2017, Firenze, Italy. 2017, Proceedings of the 15th international workshop on Content-Based Multimedia Indexing. 〈hal-01551648〉

Metrics

Record views: 250
File downloads: 71