Audio Visual Integration with Competing Sources in the Framework of Audio Visual Speech Scene Analysis

Attigodu Ganesh 1, *, Frédéric Berthommier 1, Jean-Luc Schwartz 1
* Corresponding author
GIPSA-DPC - Département Parole et Cognition
Abstract : We introduce “Audio-Visual Speech Scene Analysis” (AVSSA) as an extension of the two-stage Auditory Scene Analysis model to audiovisual scenes made of mixtures of speakers. AVSSA assumes that a coherence index between the auditory and the visual input is computed prior to audiovisual fusion, making it possible to determine whether the sensory inputs should be bound together. Previous experiments on the modulation of the McGurk effect by audiovisually coherent vs. incoherent contexts presented before the McGurk target have provided experimental evidence supporting AVSSA. Indeed, incoherent contexts appear to decrease the McGurk effect, suggesting that they produce lower audiovisual coherence and hence less audiovisual fusion. The present experiments extend the AVSSA paradigm by creating contexts made of competing audiovisual sources and measuring their effect on McGurk targets. The competing audiovisual sources have, respectively, high and low audiovisual coherence (that is, large vs. small audiovisual comodulations in time). The first experiment involves contexts made of two auditory sources and one video source associated with either the first or the second audio source. The McGurk effect appears to be smaller after the context in which the visual source is associated with the auditory source of lower audiovisual coherence. In the second experiment, which uses the same stimuli, the participants are asked to attend to one source or the other. The data show that the modulation of fusion depends on the attentional focus. Altogether, these two experiments shed light on audiovisual binding, the AVSSA process and the role of attention.
Contributor : Frédéric Berthommier
Submitted on : Thursday, December 22, 2016 - 3:43:15 PM
Last modification on : Monday, April 9, 2018 - 12:22:49 PM
Long-term archiving on : Tuesday, March 21, 2017 - 1:26:16 AM


Attigodu Ganesh, Frédéric Berthommier, Jean-Luc Schwartz. Audio Visual Integration with Competing Sources in the Framework of Audio Visual Speech Scene Analysis. van Dijk P., Başkent D., Gaudrain E., de Kleine E., Wagner A., Lanting C. Physiology, Psychoacoustics and Cognition in Normal and Impaired Hearing, 894, Springer, pp.399-408, 2016, Advances in Experimental Medicine and Biology, ⟨10.1007/978-3-319-25474-6_42⟩. ⟨hal-01421589⟩


