Audiovisual streaming in voicing perception: new evidence for a low-level interaction between audio and visual modalities
Conference paper, 2011


Abstract

Audio-visual (AV) interaction in speech has mostly been considered in terms of redundancy and complementarity at the phonetic level, but a few experiments have shown that it also plays a significant role in early auditory analysis. We propose a new paradigm that uses the pre-voicing component (PVC) excised from a true /b/. When this so-called target PVC is added to a /p/, listeners clearly perceive /b/. Moreover, varying the amplitude of the target PVC builds a perceptual continuum ranging from /p/ (amplitude set to 0) to /b/ (original amplitude). In the audio channel, adding a series of PVCs at a fixed low amplitude before and after the target creates a stream of regular sounds that are not related to visible events. By contrast, the bilabial aperture of the /p/ is a specific speech gesture visible in the video channel, and the target PVC and the visible gesture are likewise non-redundant events. Hence, depending on its intensity level, the target PVC added to an audio /p/ can either be embedded in the stream of other PVCs or be phonetically fused, yielding the perception of /b/. To study the competition between these two alternatives and the role of AV interaction, we used a 2×2 factorial design contrasting Clear/Stream and Audio/AV conditions, with the amplitude of the target PVC under control. The "Clear" condition, which contains no stream of PVCs, provides the baseline. The streaming effect by itself is significant in the audio condition, but the novel finding is a strong AV interaction: when a stream of PVCs is present, the rate of perceived /p/ is higher in the "AV" condition than in the "Audio" condition. This suggests that the visible lip-opening gesture increases the tendency to isolate the formant trajectory towards the vowel from the PVC, hence increasing the perception of unvoiced stimuli.
We conclude that low-level audio streaming is reinforced when the visual information is not redundant, and that in this case the phonetic fusion of the voicing cue is disadvantaged by visual information.
Main file: AVSP2011-Berthommier-Schwartz.pdf (431.18 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00642648, version 1 (18-11-2011)

Identifiers

  • HAL Id: hal-00642648, version 1

Cite

Frédéric Berthommier, Jean-Luc Schwartz. Audiovisual streaming in voicing perception: new evidence for a low-level interaction between audio and visual modalities. AVSP 2011 - 10th International Conference on Auditory-Visual Speech Processing, Aug 2011, Volterra, Italy. pp.77-80. ⟨hal-00642648⟩
