How saliency, faces, and sound influence gaze in dynamic social scenes - HAL open archive
Journal article, Journal of Vision, Year: 2014

How saliency, faces, and sound influence gaze in dynamic social scenes

Abstract

Conversation scenes are a typical example in which classical models of visual attention dramatically fail to predict eye positions. Indeed, these models rarely consider faces as particular gaze attractors and never take into account the important auditory information that always accompanies dynamic social scenes. We recorded the eye movements of participants viewing dynamic conversations taking place in various contexts. Conversations were seen either with their original soundtracks or with unrelated soundtracks (unrelated speech and abrupt or continuous natural sounds). First, we analyze how auditory conditions influence the eye movement parameters of participants. Then, we model the probability distribution of eye positions across each video frame with a statistical method (Expectation-Maximization), allowing the relative contribution of different visual features such as static low-level visual saliency (based on luminance contrast), dynamic low-level visual saliency (based on motion amplitude), faces, and center bias to be quantified. Through experimental and modeling results, we show that regardless of the auditory condition, participants look more at faces, and especially at talking faces. Hearing the original soundtrack makes participants follow the speech turn-taking more closely. However, we do not find any difference between the different types of unrelated soundtracks. These eye-tracking results are confirmed by our model that shows that faces, and particularly talking faces, are the features that best explain the gazes recorded, especially in the original soundtrack condition. Low-level saliency is not a relevant feature to explain eye positions made on social scenes, even dynamic ones. Finally, we propose groundwork for an audiovisual saliency model.
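The modeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract states only that Expectation-Maximization is used to quantify the relative contribution of each feature. The sketch assumes that eye positions on a frame are drawn from a weighted mixture of fixed feature maps, each normalized to a probability distribution over pixels, and that EM estimates the mixture weights. The function names, map shapes, and synthetic data below are all hypothetical.

```python
import numpy as np


def em_mixture_weights(feature_maps, eye_positions, n_iter=500, tol=1e-9):
    """Estimate mixture weights of fixed feature maps from eye positions.

    feature_maps: array of shape (K, H, W), non-negative values.
    eye_positions: integer array of shape (N, 2) holding (row, col) pairs.
    Returns an array of K weights that sum to 1.
    """
    maps = np.asarray(feature_maps, dtype=float) + 1e-12  # avoid exact zeros
    # Normalize each map so it is a probability distribution over pixels.
    maps = maps / maps.sum(axis=(1, 2), keepdims=True)
    rows, cols = np.asarray(eye_positions).T
    # Likelihood of each eye position under each map, shape (N, K).
    lik = maps[:, rows, cols].T
    k = maps.shape[0]
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each map for each eye position.
        joint = lik * weights
        resp = joint / joint.sum(axis=1, keepdims=True)
        # M-step: new weight of a map is its mean responsibility.
        new_weights = resp.mean(axis=0)
        converged = np.abs(new_weights - weights).max() < tol
        weights = new_weights
        if converged:
            break
    return weights


def gaussian_map(h, w, center, sigma):
    """Isotropic Gaussian blob on an (h, w) grid (hypothetical helper)."""
    r, c = np.mgrid[0:h, 0:w]
    d2 = (r - center[0]) ** 2 + (c - center[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))


def sample_positions(prob_map, n, rng):
    """Draw n (row, col) positions with probability proportional to prob_map."""
    p = prob_map.ravel() / prob_map.sum()
    idx = rng.choice(p.size, size=n, p=p)
    return np.column_stack(np.unravel_index(idx, prob_map.shape))


# Synthetic demonstration (assumed data, not from the study).
rng = np.random.default_rng(0)
H, W = 60, 80
face_map = gaussian_map(H, W, center=(20, 55), sigma=4.0)     # stand-in for a face
center_map = gaussian_map(H, W, center=(30, 40), sigma=10.0)  # stand-in for center bias
saliency_map = rng.random((H, W))                             # stand-in for low-level saliency

# Simulated gaze: 70% of positions drawn from the face map, 30% from center bias.
positions = np.vstack([
    sample_positions(face_map, 1400, rng),
    sample_positions(center_map, 600, rng),
])

names = ["faces", "center_bias", "static_saliency"]
weights = em_mixture_weights([face_map, center_map, saliency_map], positions)
for name, weight in zip(names, weights):
    print(f"{name}: {weight:.3f}")
```

Because the maps are held fixed and only the weights are estimated, each estimated weight can be read as the share of eye positions attributed to that feature, which is the kind of relative contribution the abstract refers to.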
Main file
Coutrot_JoV2014.pdf (990.17 KB)
Origin: publisher files allowed on an open archive

Dates and versions

hal-01018237, version 1 (03-07-2014)

Identifiers

Cite

Antoine Coutrot, Nathalie Guyader. How saliency, faces, and sound influence gaze in dynamic social scenes. Journal of Vision, 2014, 14 (8), pp.1-17. ⟨10.1167/14.8.5⟩. ⟨hal-01018237⟩
298 views
283 downloads
