Journal article, Multimedia Tools and Applications, Year: 2020

Hierarchical Multi-Label Propagation using Speaking Face Graphs for Multimodal Person Discovery

Abstract

TV archives are growing so fast that manual indexing has become infeasible. Automatic indexing techniques can overcome this issue, and this work proposes an unsupervised technique for multimodal person discovery. To this end, we propose a hierarchical label propagation technique based on quasi-flat zones theory that learns from labeled and unlabeled data and propagates names through a multimodal graph representation. In this representation, we combine audio, video, and text processing techniques to model the data as a graph of speaking faces. In the proposed modeling, we extract names via optical character recognition and propagate them through the graph using audiovisual relationships between speaking faces. We also use a random walk label propagation and two graph clustering strategies as baselines. The proposed label propagation techniques always outperform the clustering baselines in the quantitative assessments. Our approach also outperforms all methods from the literature tested on the same dataset except one, which uses a different preprocessing step. The proposed hierarchical label propagation and the random walk baseline produce highly equivalent results according to the Kappa coefficient, but the hierarchical propagation is parameter-free and over 9 times faster than the random walk under the same configurations.
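
To make the idea of propagating names through a hierarchy of graph components more concrete, here is a minimal sketch. It is not the authors' implementation. It assumes that speaking faces are graph nodes, that edges carry an audiovisual similarity score, and that the hierarchy is built by merging the most similar components first, in the spirit of a quasi-flat zones hierarchy. The names `propagate_labels`, `edges`, and `seeds` are hypothetical, and the rule that two already-named components are never merged is an assumption made for illustration.

```python
def propagate_labels(num_nodes, edges, seeds):
    """Spread names from seed nodes to unnamed nodes (illustrative sketch).

    num_nodes: number of speaking-face nodes, indexed 0..num_nodes-1
    edges:     list of (u, v, similarity) tuples (assumed input format)
    seeds:     dict mapping a node index to a name, e.g. obtained from OCR
    Returns a dict mapping every node to a name, or None if none reached it.
    """
    parent = list(range(num_nodes))
    # Name attached to each component root; None while the component is unnamed.
    comp_label = [seeds.get(i) for i in range(num_nodes)]

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Visit edges from most to least similar, so that tightly connected
    # components are formed first and coarser ones later.
    for u, v, _ in sorted(edges, key=lambda e: e[2], reverse=True):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        lu, lv = comp_label[ru], comp_label[rv]
        if lu is not None and lv is not None:
            # Assumption: two named components are kept separate.
            continue
        # Merge; the merged component inherits the name if either side has one.
        parent[rv] = ru
        comp_label[ru] = lu if lu is not None else lv

    return {n: comp_label[find(n)] for n in range(num_nodes)}


if __name__ == "__main__":
    toy_edges = [(0, 1, 0.9), (1, 2, 0.4), (2, 3, 0.8), (3, 4, 0.7)]
    toy_seeds = {0: "alice", 4: "bob"}
    print(propagate_labels(6, toy_edges, toy_seeds))
```

In this toy graph, nodes 2 and 3 are merged with each other before either reaches a named node, and the stronger edge to node 4 then decides their name. The sketch needs no tuning parameter beyond the edge weights, which is consistent with the parameter-free property claimed in the abstract, although the actual method may differ in every detail.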
Main file
MTAP-MPD.pdf (1.03 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02926035, version 1 (31-08-2020)

Identifiers

HAL Id: hal-02926035
DOI: 10.1007/s11042-020-09692-x

Cite

Gabriel Barbosa da Fonseca, Gabriel Sargent, Ronan Sicre, Zenilton Kleber Gonçalves Do Patrocinio, Guillaume Gravier, et al. Hierarchical Multi-Label Propagation using Speaking Face Graphs for Multimodal Person Discovery. Multimedia Tools and Applications, in press, pp.1-27. ⟨10.1007/s11042-020-09692-x⟩. ⟨hal-02926035⟩
