Toward an audiovisual attention model for multimodal video content

Naty Sidaty; Mohamed-Chaker Larabi; Abdelhakim Saadane

doi:10.1016/j.neucom.2016.08.130

Journal article, Neurocomputing, Year: 2017

Naty Sidaty

Role: Author

Mohamed-Chaker Larabi

Role: Author
PersonId : 4182
IdHAL : mohamed-chaker-larabi
ORCID : 0000-0003-4511-5381
IdRef : 073523550

Synthèse et analyse d'images

Université de Poitiers = University of Poitiers

Abdelhakim Saadane

Role: Author

Ecole Polytechnique de l'Université de Nantes

Synthèse et analyse d'images

Abstract

Visual attention modeling is a very active research field, and several image and video attention models have been proposed over the last decade. However, although various studies have concluded that the presence of sound influences human gaze, most classical video attention models do not account for the multimodal nature of video (visual and auditory cues). In this paper, we propose an audiovisual saliency model that aims to predict human gaze maps during the exploration of video content. The model, intended for videoconferencing, is based on the fusion of spatial, temporal and auditory attentional maps. Based on a real-time audiovisual speaker localization approach, the proposed auditory map is modulated depending on the nature of the faces in the video, i.e. speaker or listener. State-of-the-art performance measures have been used to compare the predicted saliency maps with the eye-tracking ground truth. The results show that the proposed model performs very well and significantly improves on non-audio models.
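
The abstract does not state the fusion rule, the map resolutions, or how strongly speaker and listener faces are weighted. As a rough illustration only, the sketch below assumes a weighted linear combination of three maps scaled to [0, 1], with an auditory map built from face bounding boxes. The names `normalize`, `auditory_map`, `fuse`, the weights, and the `SPEAKER_WEIGHT` / `LISTENER_WEIGHT` values are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Map = List[List[float]]
# (top, left, bottom, right, is_speaker); bottom/right are exclusive
Face = Tuple[int, int, int, int, bool]

# Assumed values for illustration; the paper's modulation is not given here.
SPEAKER_WEIGHT = 1.0
LISTENER_WEIGHT = 0.3


def normalize(m: Sequence[Sequence[float]]) -> Map:
    """Min-max scale a 2-D map to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in m for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / span for v in row] for row in m]


def auditory_map(height: int, width: int, faces: Sequence[Face]) -> Map:
    """Fill each face box with a weight that is higher for the speaker."""
    m = [[0.0] * width for _ in range(height)]
    for top, left, bottom, right, is_speaker in faces:
        w = SPEAKER_WEIGHT if is_speaker else LISTENER_WEIGHT
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                m[y][x] = max(m[y][x], w)  # overlapping boxes keep the larger weight
    return m


def fuse(spatial: Map, temporal: Map, auditory: Map,
         weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> Map:
    """Weighted average of three same-sized maps assumed to lie in [0, 1]."""
    maps = (spatial, temporal, auditory)
    shape = [len(row) for row in spatial]
    if any([len(row) for row in m] != shape for m in maps):
        raise ValueError("all maps must have the same shape")
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    return [
        [sum(w * m[y][x] for w, m in zip(weights, maps)) / total
         for x in range(len(spatial[y]))]
        for y in range(len(spatial))
    ]


if __name__ == "__main__":
    spatial = normalize([[0, 2], [4, 8]])
    temporal = normalize([[1, 1], [1, 1]])  # no motion: all zeros
    audio = auditory_map(2, 2, [(0, 0, 1, 1, True), (1, 1, 2, 2, False)])
    for row in fuse(spatial, temporal, audio, weights=(1.0, 1.0, 2.0)):
        print(row)
```

In this toy setup, giving the auditory map a larger weight pulls predicted saliency toward the speaker's face, which is the qualitative behavior the abstract describes; other fusion strategies (for example a pixel-wise maximum or a product) would slot into `fuse` in the same way.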

Keywords

Audiovisual saliency; Talking faces; Visual attention; Eye-tracking; Audio-visual synchrony; Fusion strategies

Domains

Signal and Image Processing [eess.SP]

https://hal.science/hal-01355968

Submitted on: Wednesday, 24 August 2016, 15:32:26

Last modified on: Friday, 8 March 2024, 10:16:48

Dates and versions

hal-01355968 , version 1 (24-08-2016)

Identifiers

HAL Id : hal-01355968 , version 1
DOI : 10.1016/j.neucom.2016.08.130

Cite

Naty Sidaty, Mohamed-Chaker Larabi, Abdelhakim Saadane. Toward an audiovisual attention model for multimodal video content. Neurocomputing, 2017, 259, pp. 94-111. ⟨10.1016/j.neucom.2016.08.130⟩. ⟨hal-01355968⟩

Collections

UNILIM UNIV-NANTES CNRS UNIV-POITIERS XLIM XLIM-ASALI NANTES-UNIVERSITE
