A study of lip movements during spontaneous dialog and its application to voice activity detection

David Sodoyer 1 Bertrand Rivet 2 Laurent Girin 2 Christophe Savariaux 3 Jean-Luc Schwartz 4 Christian Jutten 5
2 GIPSA-MPACIF - MPACIF
GIPSA-DPC - Département Parole et Cognition
3 GIPSA-Services - GIPSA-Services
GIPSA-lab - Grenoble Images Parole Signal Automatique
4 GIPSA-PMD - PMD
GIPSA-DPC - Département Parole et Cognition
5 GIPSA-SIGMAPHY - SIGMAPHY
GIPSA-DIS - Département Images et Signal
Abstract : This paper presents a quantitative and comprehensive study of the lip movements of a given speaker in different speech/nonspeech contexts, with a particular focus on silences (i.e., when no sound is produced by the speaker). The aim is to characterize the relationship between "lip activity" and "speech activity" and then to use visual speech information as a voice activity detector (VAD). To this aim, an original audiovisual corpus was recorded with two speakers involved in a face-to-face spontaneous dialog, although being in separate rooms. Each speaker communicated with the other using a microphone, a camera, a screen, and headphones. This system was used to capture separate audio stimuli for each speaker and to synchronously monitor the speaker's lip movements. A comprehensive analysis was carried out on the lip shapes and lip movements in either silence or nonsilence (i.e., speech+nonspeech audible events). A single visual parameter, defined to characterize the lip movements, was shown to be efficient for the detection of silence sections. This results in a visual VAD that can be used in any kind of environment noise, including intricate and highly nonstationary noises, e.g., multiple and/or moving noise sources or competing speech signals.
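The idea of detecting silence from a single lip-movement parameter can be illustrated with a minimal sketch. It is not the parameter defined in the paper, which the abstract does not specify. It assumes, hypothetically, that per-frame lip width and height measurements are available, takes the smoothed sum of their absolute frame-to-frame changes as the "lip activity" value, and flags a frame as nonsilence when that value exceeds a threshold. The names `lip_activity` and `visual_vad`, the window length, and the threshold are all illustrative assumptions.

```python
from typing import List


def lip_activity(width: List[float], height: List[float], window: int = 5) -> List[float]:
    """Smoothed sum of absolute frame-to-frame changes in lip width and height.

    Assumption: this is an illustrative movement measure, not the paper's parameter.
    """
    if len(width) != len(height):
        raise ValueError("width and height must have the same number of frames")
    if window < 1:
        raise ValueError("window must be at least 1")

    # Raw movement per frame; the first frame has no predecessor, so it gets 0.
    raw = [0.0]
    for t in range(1, len(width)):
        raw.append(abs(width[t] - width[t - 1]) + abs(height[t] - height[t - 1]))

    # Causal moving average over the last `window` frames (fewer at the start).
    smoothed = []
    for t in range(len(raw)):
        chunk = raw[max(0, t - window + 1): t + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def visual_vad(width: List[float], height: List[float],
               threshold: float = 0.5, window: int = 5) -> List[bool]:
    """Return one flag per frame: True where lip activity exceeds the threshold."""
    return [a > threshold for a in lip_activity(width, height, window)]


if __name__ == "__main__":
    # Hypothetical measurements: still lips, then alternating opening and closing.
    w = [10.0] * 5 + [10.0, 12.0, 10.0, 12.0, 10.0]
    h = [5.0] * 10
    print(visual_vad(w, h, threshold=0.5, window=2))
```

Because such a detector relies only on the video stream, its decisions do not depend on the acoustic noise conditions, which is the property the abstract emphasizes. In practice the threshold and smoothing window would have to be tuned on labeled data.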

Cited literature [60 references]

https://hal.archives-ouvertes.fr/hal-00941145
Contributor : Jean-Luc Schwartz
Submitted on : Monday, February 3, 2014 - 2:12:38 PM
Last modification on : Monday, July 8, 2019 - 3:10:06 PM
Long-term archiving on : Sunday, May 4, 2014 - 4:53:09 AM

File

Sodoyer_JASA_2008_text_R2.pdf
Files produced by the author(s)

Citation

David Sodoyer, Bertrand Rivet, Laurent Girin, Christophe Savariaux, Jean-Luc Schwartz, et al. A study of lip movements during spontaneous dialog and its application to voice activity detection. Journal of the Acoustical Society of America, Acoustical Society of America, 2009, 125 (2), pp.1184-1196. ⟨10.1121/1.3050257⟩. ⟨hal-00941145⟩
