Conference paper, Year: 2014

An audiovisual attention model for natural conversation scenes

Abstract

Classical visual attention models consider neither social cues, such as faces, nor auditory cues, such as speech. However, faces are known to capture visual attention more than any other visual feature, and recent studies have shown that speech turn-taking affects the gaze of non-involved viewers. In this paper, we propose an audiovisual saliency model that predicts the eye movements of observers watching other people having a conversation. Using a speaker diarization algorithm, the model increases the saliency of speakers relative to addressees. We evaluated the model against eye-tracking data and found that it significantly outperforms visual attention models that assign an equal and constant saliency value to all faces.
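The abstract does not specify how speaker saliency is raised, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each face is modelled as a Gaussian blob and that a diarization step has already labelled each face as speaking or not. The function name `face_saliency_map`, the dictionary keys, and the weights `speaker_weight` and `listener_weight` are all hypothetical and chosen for illustration.

```python
import math

def face_saliency_map(width, height, faces, speaker_weight=2.0, listener_weight=1.0):
    """Return a height x width saliency map (list of rows) that sums to 1.

    faces: list of dicts with keys 'cx', 'cy' (face centre in pixels),
    'sigma' (spread in pixels) and 'speaking' (bool, assumed to come
    from a speaker diarization step). All names here are hypothetical.
    """
    grid = [[0.0] * width for _ in range(height)]
    for face in faces:
        # Speakers receive a larger weight than addressees (assumed values).
        weight = speaker_weight if face["speaking"] else listener_weight
        for y in range(height):
            for x in range(width):
                d2 = (x - face["cx"]) ** 2 + (y - face["cy"]) ** 2
                grid[y][x] += weight * math.exp(-d2 / (2.0 * face["sigma"] ** 2))
    total = sum(sum(row) for row in grid)
    if total == 0:
        return grid  # no faces: leave the map empty
    # Normalise so the map can be read as a probability distribution.
    return [[v / total for v in row] for row in grid]

faces = [
    {"cx": 10, "cy": 10, "sigma": 3.0, "speaking": True},
    {"cx": 30, "cy": 10, "sigma": 3.0, "speaking": False},
]
saliency = face_saliency_map(40, 20, faces)
# Baseline in the spirit of the abstract: equal, constant weight for all faces.
baseline = face_saliency_map(40, 20, faces, speaker_weight=1.0, listener_weight=1.0)
```

Setting both weights to the same value gives the kind of equal-face baseline the abstract compares against; only the relative weight of the speaker differs between the two maps.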
Main file
Coutrot_ICIP2014.pdf (1.96 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01009467, version 1 (18-06-2014)

Identifiers

  • HAL Id: hal-01009467, version 1

Cite

Antoine Coutrot, Nathalie Guyader. An audiovisual attention model for natural conversation scenes. ICIP 2014 - 21st IEEE International Conference on Image Processing, Oct 2014, Paris, France. pp.1-5. ⟨hal-01009467⟩