Automatic Detection of Screams and Shouts in the Metro

Abstract: This study proposes a security and surveillance system that automatically detects and recognizes screams and shouts in the metro, based on classification through statistical modeling. Using a database recorded from enactments of violent scenes aboard a Paris metro train in operation, we estimated statistical models with three neural network architectures (DNN, CNN and RNN/LSTM). The models were first trained to recognize three categories of sounds (shouts, speech and background noise). More categories describing the surrounding environment were then introduced to bring in contextual information, with the data treated either as isolated sound events or as a continuous audio stream. The results point to the higher modeling power of the temporal model, which takes into account the temporal structure of sound events. The scores for the classification of the three categories (shout, speech and background) turned out to be quite satisfactory, regardless of the rest of the acoustic environment, and adding contextual information proved useful. During this study we observed that the lack of data is a major limiting factor. It could be circumvented by transfer learning, which consists in using more complex networks pre-trained on different data, and by data augmentation, which consists in increasing the amount of data by creating synthetic data from existing recordings.
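The abstract mentions data augmentation without saying which techniques were used. As an illustration only, here is a minimal sketch of one common approach, assuming that a clean sound event is mixed with a background noise recording at a chosen signal-to-noise ratio. The function names `mix_at_snr` and `augment_with_noise` and the SNR values are hypothetical and are not taken from the thesis.

```python
import numpy as np


def mix_at_snr(signal, noise, snr_db):
    """Add noise to signal, scaled so the mixture has the requested SNR (dB).

    Hypothetical helper for illustration; not the method used in the thesis.
    """
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)

    # Repeat or trim the noise so it covers the whole signal.
    reps = int(np.ceil(len(signal) / len(noise)))
    noise = np.tile(noise, reps)[: len(signal)]

    p_signal = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    if p_signal == 0 or p_noise == 0:
        raise ValueError("signal and noise must both have non-zero power")

    # Choose the gain so that p_signal / (gain**2 * p_noise) == 10**(snr_db / 10).
    gain = np.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return signal + gain * noise


def augment_with_noise(signal, noise, snrs_db=(0.0, 5.0, 10.0)):
    """Return one synthetic copy of signal per requested SNR (assumed values)."""
    return [mix_at_snr(signal, noise, snr) for snr in snrs_db]
```

Each call to `augment_with_noise` turns one labeled recording into several synthetic ones that keep the original label, which is one way to enlarge a small training set.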
Cited literature: 131 references

https://hal.archives-ouvertes.fr/tel-02202080
Contributor: Ifsttar Cadic
Submitted on: Wednesday, July 31, 2019 - 3:34:11 PM
Last modification on: Thursday, August 1, 2019 - 2:11:44 AM

File

doc00029028.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-02202080, version 1

Citation

Pierre Laffitte. Automatic Detection of Screams and Shouts in the Metro. Signal and Image processing. Université Lille 1 Nord de France, 2017. English. ⟨tel-02202080⟩
