Journal articles

Relevance-based quantization of scattering features for unsupervised mining of environmental audio

Abstract: The emerging field of computational acoustic monitoring aims to retrieve high-level information from acoustic scenes recorded by networks of sensors. These networks gather large amounts of data that require analysis. To decide which parts to inspect further, we need tools that automatically mine the data, identifying recurring patterns and isolated events. This requires a similarity measure for acoustic scenes that does not impose strong assumptions on the data. The state of the art in audio similarity measurement is the "bag-of-frames" approach, which models a recording using summary statistics of short-term audio descriptors, such as mel-frequency cepstral coefficients (MFCCs). Such models successfully characterise static scenes with little variability in auditory content, but cannot accurately capture scenes with a few salient events superimposed over a static background. To overcome this issue, we propose a two-scale representation which describes a recording using clusters of scattering coefficients. The scattering coefficients capture short-scale structure, while the cluster model captures longer time scales, allowing for more accurate characterization of sparse events. Evaluation within the acoustic scene similarity framework demonstrates the benefit of the proposed approach.

Keywords: unsupervised learning · data mining · acoustic signal processing · wavelet transforms · audio databases · content-based retrieval · nearest neighbor searches · acoustic sensors · environmental sensors
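
The contrast drawn in the abstract between a bag-of-frames summary and a cluster-based summary can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: random vectors stand in for scattering coefficients, plain k-means with farthest-point initialization stands in for the clustering used in the article, and the function names `bag_of_frames` and `cluster_summary` are hypothetical.

```python
import numpy as np


def bag_of_frames(features):
    """Summarize a (n_frames, n_dims) array by its per-dimension mean and std."""
    return np.concatenate([features.mean(axis=0), features.std(axis=0)])


def nearest_centroid(features, centroids):
    """Return, for each frame, the index of its nearest centroid."""
    d = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)


def cluster_summary(features, k, n_iter=20):
    """Plain k-means (Lloyd's algorithm), an illustrative stand-in only.

    Returns (centroids, counts): k centroids and the frame count of each.
    """
    # Farthest-point initialization: start from frame 0, then repeatedly add
    # the frame that lies farthest from the centroids chosen so far.
    chosen = [features[0]]
    for _ in range(1, k):
        dists = np.min(
            [np.linalg.norm(features - c, axis=1) for c in chosen], axis=0
        )
        chosen.append(features[np.argmax(dists)])
    centroids = np.array(chosen, dtype=float)

    for _ in range(n_iter):
        labels = nearest_centroid(features, centroids)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)

    counts = np.bincount(nearest_centroid(features, centroids), minlength=k)
    return centroids, counts


# Synthetic scene (assumed data): 500 background frames near 0 and
# 5 salient event frames near 5, each with 8 feature dimensions.
rng = np.random.default_rng(0)
background = rng.normal(0.0, 0.1, size=(500, 8))
events = rng.normal(5.0, 0.1, size=(5, 8))
scene = np.vstack([background, events])

summary = bag_of_frames(scene)                    # events barely shift the mean
centroids, counts = cluster_summary(scene, k=2)   # events get their own centroid
```

In this toy scene the five event frames move the bag-of-frames mean only slightly, whereas the cluster summary keeps a separate centroid for them, which is the intuition behind describing sparse events with clusters rather than global statistics.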

Contributor: Mathieu Lagrange
Submitted on: Thursday, January 10, 2019 - 2:39:54 PM
Last modification on: Friday, August 5, 2022 - 2:54:51 PM
Long-term archiving on: Thursday, April 11, 2019 - 3:50:27 PM


Vincent Lostanlen, Grégoire Lafay, Joakim Andén, Mathieu Lagrange. Relevance-based quantization of scattering features for unsupervised mining of environmental audio. EURASIP Journal on Audio, Speech, and Music Processing, SpringerOpen, 2018, 2018 (1), ⟨10.1186/s13636-018-0138-4⟩. ⟨hal-01887403⟩


