Multiobjective Time Series Matching for Audio Classification and Retrieval - HAL open archive
Journal article, IEEE Transactions on Audio, Speech and Language Processing, Year: 2013

Multiobjective Time Series Matching for Audio Classification and Retrieval

Abstract

Seeking sound samples in a massive database can be a tedious and time-consuming task. Even when metadata are available, query results may remain far from the timbre expected by users. This problem stems from the nature of query specification, which does not account for the underlying complexity of audio data. The Query By Example (QBE) paradigm tries to tackle this shortcoming by finding audio clips similar to a given sound example. However, it requires users to have a well-formed sound file of what they seek, which is not always a valid assumption. Furthermore, most audio-retrieval systems rely on a single measure of similarity, which is unlikely to convey the perceptual similarity of audio signals. In this paper, we propose an innovative way of querying generic audio databases by simultaneously optimizing the temporal evolution of multiple spectral properties. We show how this problem can be cast into a new approach merging multiobjective optimization and time series matching, called MultiObjective Time Series (MOTS) matching. We formally state this problem and report an efficient implementation. This approach introduces a multidimensional assessment of similarity in audio matching. This makes it possible to cope with the multidimensional nature of timbre perception and to obtain a set of efficient proposals rather than a single best solution. To demonstrate the performance of our approach, we show its efficiency in audio classification tasks. By introducing a selection criterion based on the hypervolume dominated by a class, we show that our approach outperforms state-of-the-art methods in audio classification, even with a small number of features. We demonstrate its robustness to several classes of audio distortions. Finally, we introduce two innovative applications of our method for sound querying.
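
To illustrate the multidimensional assessment of similarity described in the abstract, the following minimal Python sketch computes one distance per spectral feature between a query and each candidate clip, then keeps only the non-dominated (Pareto-optimal) candidates. It is not the authors' implementation: the Euclidean distance is a stand-in for whichever time series measure the paper actually uses, and the feature names (`centroid`, `loudness`), the clip names, and the helper functions are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series (assumed stand-in measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_vector(query, candidate):
    """One distance per feature; both arguments map feature name -> time series."""
    return tuple(euclidean(query[name], candidate[name]) for name in sorted(query))


def dominates(u, v):
    """u dominates v if it is no worse on every objective and strictly better on one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def pareto_front(database, query):
    """Return the sorted names of candidates that no other candidate dominates."""
    scores = {name: distance_vector(query, feats) for name, feats in database.items()}
    return sorted(
        name
        for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    )


# Toy data (hypothetical): two spectral descriptors sampled at three time steps.
query = {"centroid": [1.0, 2.0, 3.0], "loudness": [0.5, 0.5, 0.5]}
database = {
    "clip_a": {"centroid": [1.0, 2.0, 3.0], "loudness": [0.9, 0.9, 0.9]},  # best centroid match
    "clip_b": {"centroid": [2.0, 3.0, 4.0], "loudness": [0.5, 0.5, 0.5]},  # best loudness match
    "clip_c": {"centroid": [3.0, 4.0, 5.0], "loudness": [1.0, 1.0, 1.0]},  # worse on both
}

print(pareto_front(database, query))  # ['clip_a', 'clip_b']
```

Here `clip_a` and `clip_b` each win on one feature, so neither dominates the other and both are returned, whereas `clip_c` is worse than `clip_a` on both features and is discarded. This is the sense in which a multiobjective query yields a set of efficient proposals rather than a single best match.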
Main file
manuscript_double.pdf (2.03 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01577892, version 1 (28-08-2017)

Cite

Philippe Esling, Carlos Agon. Multiobjective Time Series Matching for Audio Classification and Retrieval. IEEE Transactions on Audio, Speech and Language Processing, 2013, 21 (10), pp.2057-2072. ⟨10.1109/TASL.2013.2265086⟩. ⟨hal-01577892⟩