Conference papers

Leveraging deep neural networks with nonnegative representations for improved environmental sound classification

Abstract : This paper introduces the use of representations based on non-negative matrix factorization (NMF) to train deep neural networks, with applications to environmental sound classification. Deep learning systems for sound classification usually rely on the network to learn meaningful representations from spectrograms or hand-crafted features. Instead, we introduce an NMF-based feature learning stage before training deep networks, whose usefulness is highlighted in this paper, especially for multi-source acoustic environments such as sound scenes. We rely on two established NMF techniques, unsupervised and supervised, to learn better input representations for deep neural networks. This allows us, with simple architectures, to reach performance competitive with more complex systems such as convolutional networks for acoustic scene classification. The proposed systems outperform neural networks trained on time-frequency representations on two acoustic scene classification datasets, as well as the best systems from the 2016 DCASE challenge.
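To make the idea of an NMF-based feature learning stage concrete, the following is a minimal sketch and not the authors' implementation. It assumes plain unsupervised NMF with a Euclidean cost and multiplicative updates, which may differ from the variants used in the paper. The function names `nmf` and `project`, the toy matrix sizes, and the number of components are all hypothetical choices made for illustration. The nonnegative activations it produces are the kind of representation that would then be fed to a neural network in place of a raw time-frequency input.

```python
import numpy as np


def nmf(V, n_components, n_iter=200, eps=1e-10, seed=0):
    """Approximate nonnegative V (features x frames) as W @ H.

    Uses multiplicative updates for the Euclidean cost (illustrative choice).
    """
    rng = np.random.default_rng(seed)
    n_features, n_frames = V.shape
    W = rng.random((n_features, n_components)) + eps  # dictionary
    H = rng.random((n_components, n_frames)) + eps    # activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
    return W, H


def project(V, W, n_iter=200, eps=1e-10, seed=0):
    """Compute activations of new data V on a fixed, already learned dictionary W."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H


# Toy stand-in for a nonnegative time-frequency representation:
# 40 frequency bins x 100 frames (sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
V = rng.random((40, 6)) @ rng.random((6, 100))

# Learn a dictionary on the "training" data; one feature vector per frame.
W, H = nmf(V, n_components=8)
features = H.T  # shape: (frames, components)

# Unseen data is projected on the learned dictionary to get its features.
V_new = rng.random((40, 6)) @ rng.random((6, 20))
new_features = project(V_new, W).T
```

In this sketch, `features` and `new_features` play the role of the network's training and test inputs; the classifier itself is left out because the abstract does not specify its architecture beyond calling it simple.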

https://hal.archives-ouvertes.fr/hal-01576857
Contributor : Victor Bisot
Submitted on : Tuesday, November 28, 2017 - 1:37:54 PM
Last modification on : Saturday, September 19, 2020 - 12:18:02 PM

File

leveraging-deep-neural.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01576857, version 1

Citation

Victor Bisot, Romain Serizel, Slim Essid, Gael Richard. Leveraging deep neural networks with nonnegative representations for improved environmental sound classification. IEEE International Workshop on Machine Learning for Signal Processing MLSP, Sep 2017, Tokyo, Japan. ⟨hal-01576857⟩
