Multiobjective Time Series Matching for Audio Classification and Retrieval

Philippe Esling; Carlos Agon

doi:10.1109/TASL.2013.2265086

Journal article, IEEE Transactions on Audio, Speech and Language Processing, Year: 2013


Philippe Esling

Role: Author
PersonId: 14916
IdHAL: philippe-esling
ORCID: 0000-0002-1655-7909
IdRef: 172472873

Sciences et Technologies de la Musique et du Son

Carlos Agon

Role: Author

Représentations musicales

Abstract

Seeking sound samples in a massive database can be a tedious and time-consuming task. Even when metadata are available, query results may remain far from the timbre expected by users. This problem stems from the nature of query specification, which does not account for the underlying complexity of audio data. The Query By Example (QBE) paradigm tries to tackle this shortcoming by finding audio clips similar to a given sound example. However, it requires users to have a well-formed sound file of what they seek, which is not always a valid assumption. Furthermore, most audio-retrieval systems rely on a single measure of similarity, which is unlikely to convey the perceptual similarity of audio signals. In this paper, we address an innovative way of querying generic audio databases by simultaneously optimizing the temporal evolution of multiple spectral properties. We show how this problem can be cast into a new approach merging multiobjective optimization and time series matching, called MultiObjective Time Series (MOTS) matching. We formally state this problem and report an efficient implementation. This approach introduces a multidimensional assessment of similarity in audio matching. This makes it possible to cope with the multidimensional nature of timbre perception and to obtain a set of efficient propositions rather than a single best solution. To demonstrate the performance of our approach, we show its efficiency in audio classification tasks. By introducing a selection criterion based on the hypervolume dominated by a class, we show that our approach outperforms state-of-the-art methods in audio classification, even with a small number of features. We demonstrate its robustness to several classes of audio distortions. Finally, we introduce two innovative applications of our method for sound querying.
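The idea of returning "a set of efficient propositions rather than a single best solution" can be illustrated with a minimal sketch of Pareto (non-dominated) selection over several per-descriptor distances. This is not the authors' implementation: the function names, the toy data, and the choice of Euclidean distance between descriptor time series are assumptions made only for illustration, and the hypervolume-based class selection mentioned in the abstract is not covered.

```python
import numpy as np

def objective_distances(query, candidate):
    """One distance per descriptor (assumed here: Euclidean distance
    between the two time series of that descriptor).
    query, candidate: arrays of shape (n_descriptors, n_frames)."""
    return np.linalg.norm(query - candidate, axis=1)

def dominates(a, b):
    """a dominates b if a is no worse on every objective
    and strictly better on at least one (minimization)."""
    return bool(np.all(a <= b) and np.any(a < b))

def pareto_front(costs):
    """Indices of the non-dominated rows of costs (n_items, n_objectives)."""
    front = []
    for i, ci in enumerate(costs):
        if not any(dominates(cj, ci) for j, cj in enumerate(costs) if j != i):
            front.append(i)
    return front

# Hypothetical toy data: 2 descriptors, 3 frames each.
query = np.array([[0.0, 1.0, 2.0],
                  [1.0, 1.0, 1.0]])
database = [
    query + np.array([[0.1], [0.9]]),  # close on descriptor 0, far on descriptor 1
    query + np.array([[0.9], [0.1]]),  # far on descriptor 0, close on descriptor 1
    query + np.array([[1.0], [1.0]]),  # far on both descriptors
]

costs = np.array([objective_distances(query, c) for c in database])
front = pareto_front(costs)
print(front)
```

In this toy case the first two items trade off against each other and are both kept, while the third is worse on every descriptor and is discarded. A single-measure system would have to rank the first two against each other with an arbitrary weighting.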

Keywords

pattern matching; Pareto optimization; music information retrieval; audio databases; classification algorithms; content-based retrieval; data mining; data structures; indexing; multimedia databases; query processing; time series analysis

Domains

Sound [cs.SD]; Information Retrieval [cs.IR]; Signal and Image Processing [eess.SP]

Main file

manuscript_double.pdf (2.03 MB)

Origin: Files produced by the author(s)


https://hal.science/hal-01577892

Submitted on: Monday, 28 August 2017, 15:57:47

Last modified on: Friday, 24 March 2023, 14:53:05

Dates and versions

hal-01577892, version 1 (28-08-2017)

Identifiers

HAL Id: hal-01577892, version 1
DOI: 10.1109/TASL.2013.2265086

Cite

Philippe Esling, Carlos Agon. Multiobjective Time Series Matching for Audio Classification and Retrieval. IEEE Transactions on Audio, Speech and Language Processing, 2013, 21 (10), pp.2057-2072. ⟨10.1109/TASL.2013.2265086⟩. ⟨hal-01577892⟩


Collections

UPMC CNRS IRCAM STMS SORBONNE-UNIVERSITE SU-SCIENCES

