CRF based context modeling for person identification in broadcast videos

Paul Gay; Sylvain Meignier; Jean-Marc Odobez; Paul Deléglise

doi:10.3389/fict.2016.00009

Journal article. Frontiers in information and communication technologies. Year: 2016


Paul Gay

Role: Author

Laboratoire d'Informatique de l'Université du Mans

IDIAP Research Institute

Sylvain Meignier

Role: Author
PersonId : 11674
IdHAL : sylvain-meignier
ORCID : 0000-0001-7687-073X
IdRef : 182269086

Laboratoire d'Informatique de l'Université du Mans

Jean-Marc Odobez

Role: Author

IDIAP Research Institute

Paul Deléglise

Role: Author
PersonId : 998324

Laboratoire d'Informatique de l'Université du Mans

Abstract

We investigate the problem of speaker and face identification in broadcast videos. Identification is performed by associating names automatically extracted from overlaid text with speaker and face clusters. We aim to exploit the structure of news videos to resolve name/cluster association ambiguities and clustering errors. The proposed approach iteratively combines two conditional random fields (CRFs). The first CRF performs person diarization (joint temporal segmentation, clustering, and association of voices and faces) jointly over the speech segments and the face tracks. It benefits from contextual information extracted from the image backgrounds and the overlaid text. The second CRF associates names with person clusters using co-occurrence statistics. Experiments conducted on a recent and substantial public dataset containing reports and debates demonstrate the value and complementarity of the different modeling steps and information sources: using these elements yields better clustering and identification performance, especially in studio scenes.
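To make the notion of name/cluster association from co-occurrence statistics concrete, here is a minimal sketch of a much simpler baseline than the CRFs described in the abstract. It is not the authors' model. It assumes hypothetical inputs (time intervals for cluster segments and for overlaid-name appearances), accumulates temporal overlap as a co-occurrence score, and gives each cluster the name it overlaps with most. The function names (`overlap`, `cooccurrence`, `assign_names`) and the toy data are illustrative only.

```python
from collections import defaultdict


def overlap(a, b):
    """Length in seconds of the intersection of two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def cooccurrence(cluster_segments, name_segments):
    """Sum the temporal overlap between every cluster and every overlaid name."""
    scores = defaultdict(float)
    for cluster, c_span in cluster_segments:
        for name, n_span in name_segments:
            duration = overlap(c_span, n_span)
            if duration > 0:
                scores[(cluster, name)] += duration
    return scores


def assign_names(cluster_segments, name_segments):
    """Greedy baseline: each cluster takes its highest-scoring name.

    Clusters that never co-occur with a name stay unnamed.
    """
    scores = cooccurrence(cluster_segments, name_segments)
    best = {}
    for (cluster, name), score in scores.items():
        if cluster not in best or score > best[cluster][1]:
            best[cluster] = (name, score)
    return {cluster: name for cluster, (name, _) in best.items()}


# Hypothetical toy data: (cluster id, (start, end)) and (name, (start, end)).
cluster_segments = [
    ("spk_A", (0.0, 10.0)),
    ("spk_B", (10.0, 20.0)),
    ("spk_A", (20.0, 30.0)),
    ("spk_C", (30.0, 35.0)),
]
name_segments = [
    ("Alice Martin", (2.0, 6.0)),
    ("Bob Durand", (9.0, 14.0)),
    ("Alice Martin", (22.0, 25.0)),
]

labels = assign_names(cluster_segments, name_segments)
print(labels)
```

In this toy example the overlay for "Bob Durand" appears one second before `spk_B` starts speaking, so it also co-occurs with `spk_A`. A purely local rule could mislabel that segment, whereas accumulating evidence over the whole video favors the correct name. This is the kind of ambiguity that the abstract says the CRF-based approach addresses with additional context.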

Domains

Computation and Language [cs.CL]

Main file

Gay_FRONTIERS-CIA_2016.pdf (10.06 MB)

Origin: Files produced by the author(s)

Contributor: Sylvain Meignier

https://hal.science/hal-01433154

Submitted on: Tuesday, March 21, 2017, 23:20:05

Last modified on: Tuesday, December 8, 2020, 10:09:38

Long-term archiving on: Thursday, June 22, 2017, 14:58:49

Dates and versions

hal-01433154 , version 1 (21-03-2017)

Identifiers

HAL Id : hal-01433154 , version 1
DOI : 10.3389/fict.2016.00009

Cite

Paul Gay, Sylvain Meignier, Jean-Marc Odobez, Paul Deléglise. CRF based context modeling for person identification in broadcast videos. Frontiers in information and communication technologies, 2016, 3, pp.9. ⟨10.3389/fict.2016.00009⟩. ⟨hal-01433154⟩

Collections

UNIV-LEMANS LIUM LIUM-LST ANR
