Leveraging deep neural networks with nonnegative representations for improved environmental sound classification

This paper introduces the use of representations based on nonnegative matrix factorization (NMF) to train deep neural networks, with applications to environmental sound classification. Deep learning systems for sound classification usually rely on the network to learn meaningful representations from spectrograms or hand-crafted features. Instead, we introduce an NMF-based feature learning stage before training deep networks, whose usefulness is highlighted in this paper, especially for multi-source acoustic environments such as sound scenes. We rely on two established NMF techniques, one unsupervised and one supervised, to learn better input representations for deep neural networks. This allows us, with simple architectures, to reach performance competitive with more complex systems such as convolutional networks for acoustic scene classification. The proposed systems outperform neural networks trained on time-frequency representations on two acoustic scene classification datasets, as well as the best systems from the 2016 DCASE challenge.
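
The feature learning stage mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes plain unsupervised NMF with Euclidean multiplicative updates, synthetic nonnegative data standing in for time-frequency representations, and a dictionary of 5 components. The function names `nmf` and `project` are hypothetical. The resulting nonnegative activations are what would be fed to a neural network instead of the raw time-frequency frames.

```python
import numpy as np


def nmf(V, n_components, n_iter=200, eps=1e-10, seed=0):
    """Factorize nonnegative V (features x frames) as W @ H.

    Uses multiplicative updates for the Euclidean (Frobenius) cost.
    """
    rng = np.random.default_rng(seed)
    n_features, n_frames = V.shape
    W = rng.random((n_features, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def project(V, W, n_iter=200, eps=1e-10, seed=0):
    """Compute nonnegative activations for new data, keeping W fixed."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + eps)
    return H


# Synthetic nonnegative "spectrogram" frames (assumption: 40 bands, rank 5).
rng = np.random.default_rng(42)
true_W = rng.random((40, 5))
train_V = true_W @ rng.random((5, 200))
test_V = true_W @ rng.random((5, 50))

# Learn the dictionary on training data, then project both sets onto it.
W, H_train = nmf(train_V, n_components=5)
H_test = project(test_V, W)

# One row per frame: these activations would be the network inputs.
train_features = H_train.T
test_features = H_test.T

rel_error = np.linalg.norm(train_V - W @ H_train) / np.linalg.norm(train_V)
```

The dictionary `W` is learned once on the training data and held fixed when projecting held-out data, so that training and test features live in the same space. A supervised NMF variant, as referred to in the abstract, would additionally use the class labels when learning the dictionary; that is not shown here.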

Keywords

Nonnegative Matrix Factorization, Sound Classification, Deep Neural Networks

Domains

Machine Learning [cs.LG], Sound [cs.SD]

Main file

leveraging-deep-neural.pdf (148.74 KB)

Origin: Files produced by the author(s)

Contributor: Victor Bisot

https://hal.science/hal-01576857

Submitted on: Tuesday, November 28, 2017, 13:37:54

Last modified on: Monday, October 9, 2023, 12:49:39

Dates and versions

hal-01576857, version 1 (28-11-2017)

Identifiers

HAL Id: hal-01576857, version 1

Cite

Victor Bisot, Romain Serizel, Slim Essid, Gael Richard. Leveraging deep neural networks with nonnegative representations for improved environmental sound classification. IEEE International Workshop on Machine Learning for Signal Processing MLSP, Sep 2017, Tokyo, Japan. ⟨hal-01576857⟩

Collections

INSTITUT-TELECOM, CNRS, INRIA, PARISTECH, UNIV-LORRAINE, INRIA2, LORIA, LORIA-NLPKD, LTCI, IDS, S2A
