Relevance-based quantization of scattering features for unsupervised mining of environmental audio

Abstract: The emerging field of computational acoustic monitoring aims at retrieving high-level information from acoustic scenes recorded by networks of sensors. These networks gather large amounts of data requiring analysis. To decide which parts to inspect further, we need tools that automatically mine the data, identifying recurring patterns and isolated events. This requires a similarity measure for acoustic scenes that does not impose strong assumptions on the data. The state of the art in audio similarity measurement is the "bag-of-frames" approach, which models a recording using summary statistics of short-term audio descriptors, such as mel-frequency cepstral coefficients (MFCCs). Such models successfully characterize static scenes with little variability in auditory content, but cannot accurately capture scenes with a few salient events superimposed over a static background. To overcome this issue, we propose a two-scale representation which describes a recording using clusters of scattering coefficients. The scattering coefficients capture short-scale structure, while the cluster model captures longer time scales, allowing for more accurate characterization of sparse events. Evaluation within the acoustic scene similarity framework demonstrates the interest of the proposed approach.

Keywords: unsupervised learning · data mining · acoustic signal processing · wavelet transforms · audio databases · content-based retrieval · nearest neighbor searches · acoustic sensors · environmental sensors
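The contrast drawn in the abstract between summary statistics and a cluster model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the frame features are synthetic stand-ins (not MFCCs or scattering coefficients), the helper names `bag_of_frames` and `kmeans` are hypothetical, and plain k-means is swapped in for the relevance-based quantization described in the paper.

```python
import numpy as np

def bag_of_frames(frames):
    """Summarize a recording by the mean and standard deviation of its frames."""
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])

def assign(frames, centroids):
    """Return, for each frame, the index of its nearest centroid."""
    dists = np.linalg.norm(frames[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(frames, k, n_iter=20, seed=0):
    """Plain k-means (an illustrative stand-in, not the paper's quantizer)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct frames (fancy indexing returns a copy).
    centroids = frames[rng.choice(len(frames), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(frames, centroids)
        for j in range(k):
            members = frames[labels == j]
            if len(members) > 0:  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    return centroids, assign(frames, centroids)

# Synthetic "recording": 200 static background frames plus 5 salient event frames.
rng = np.random.default_rng(0)
background = 0.1 * rng.standard_normal((200, 4))
events = 10.0 + 0.1 * rng.standard_normal((5, 4))
frames = np.vstack([background, events])

summary = bag_of_frames(frames)           # one vector; the events are averaged in
centroids, labels = kmeans(frames, k=2)   # one centroid per recurring pattern

print("bag-of-frames mean:", summary[:4])
print("cluster centroids:\n", centroids)
```

In this toy setting the mean vector stays close to the background, because five event frames are diluted among two hundred background frames, whereas the cluster model keeps a separate centroid for the events. This only illustrates the general idea of a two-scale description; it says nothing about the performance of the published method.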

https://hal.archives-ouvertes.fr/hal-01887403
Contributor: Mathieu Lagrange
Submitted on: Thursday, 10 January 2019 - 14:39:54
Last modified on: Tuesday, 26 March 2019 - 09:25:22

File

lostanlenRelevanceScattering.p...
Files produced by the author(s)

Citation

Vincent Lostanlen, Grégoire Lafay, Joakim Andén, Mathieu Lagrange. Relevance-based quantization of scattering features for unsupervised mining of environmental audio. EURASIP Journal on Audio, Speech, and Music Processing, SpringerOpen, 2018, 2018 (1), 〈10.1186/s13636-018-0138-4〉. 〈hal-01887403〉

Metrics

Record views

60

File downloads

9