
Evaluation of Post-Processing Algorithms for Polyphonic Sound Event Detection

Abstract : Sound event detection (SED) aims to identify the sound events present in a recording (the audio tagging task) and then to locate them in time (the segmentation task). The segmentation task ends by segmenting the frame-level class predictions, which determines the onsets and offsets of the sound events. This step is often overlooked in scientific publications. In this paper, we focus on the post-processing algorithms used to identify sound event boundaries. We investigate several post-processing steps: smoothing, thresholding, and optimization. In particular, we evaluate two families of temporal segmentation approaches: statistics-based and parametric methods. Experiments were carried out on the DCASE 2018 challenge task 4 data. We compared post-processing algorithms on the temporal prediction curves of two models: one based on the challenge's baseline and one based on Multiple Instance Learning (MIL). Results show the crucial impact of the post-processing methods on the final detection scores. When ground-truth audio tags were used to retain the final temporal predictions of interest, statistics-based methods yielded a 29.9% event-based F-score on the evaluation set with MIL. The best results were obtained using class-dependent parametric methods, with a 43.9% F-score. The post-processing methods and optimization algorithms have been compiled into a Python library named "aeseg".
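As a rough illustration of the smoothing and thresholding steps mentioned in the abstract, the sketch below turns a frame-level prediction curve for one class into (onset, offset) pairs. It is a minimal example under stated assumptions, not the paper's method and not the API of "aeseg": the function names `smooth` and `segment`, the moving-average filter, the fixed 0.5 threshold, and the 0.5 s frame duration are all hypothetical choices made for the example.

```python
def smooth(curve, window=3):
    """Centered moving average (assumed filter); the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(curve)):
        start = max(0, i - half)
        stop = min(len(curve), i + half + 1)
        out.append(sum(curve[start:stop]) / (stop - start))
    return out


def segment(curve, threshold=0.5, frame_seconds=1.0):
    """Return (onset, offset) pairs in seconds for runs of frames >= threshold."""
    events = []
    onset_frame = None
    for i, p in enumerate(curve):
        active = p >= threshold
        if active and onset_frame is None:
            onset_frame = i  # event starts at this frame
        elif not active and onset_frame is not None:
            events.append((onset_frame * frame_seconds, i * frame_seconds))
            onset_frame = None
    if onset_frame is not None:
        # the event is still active when the recording ends
        events.append((onset_frame * frame_seconds, len(curve) * frame_seconds))
    return events


# Hypothetical frame-level probabilities for a single class.
curve = [0.1, 0.2, 0.9, 0.8, 0.1, 0.9, 0.7, 0.2]

raw_events = segment(curve, threshold=0.5, frame_seconds=0.5)
smoothed_events = segment(smooth(curve, window=3), threshold=0.5, frame_seconds=0.5)

print(raw_events)       # two short events separated by a one-frame dip
print(smoothed_events)  # the dip is smoothed away, giving a single event
```

On this toy curve, thresholding alone splits the activity into two events, while smoothing first merges them into one. This is the kind of difference in detected boundaries that affects an event-based F-score.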
Document type :
Conference papers


https://hal.archives-ouvertes.fr/hal-02942302
Contributor : Open Archive Toulouse Archive Ouverte (oatao)
Submitted on : Thursday, September 17, 2020 - 4:57:59 PM
Last modification on : Friday, August 27, 2021 - 11:26:03 AM
Long-term archiving on : Thursday, December 3, 2020 - 10:23:00 AM



Citation

Léo Cances, Patrice Guyot, Thomas Pellegrini. Evaluation of Post-Processing Algorithms for Polyphonic Sound Event Detection. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2019), Oct 2019, New Paltz, NY, United States. pp.318-322, ⟨10.1109/WASPAA.2019.8937143⟩. ⟨hal-02942302⟩
