
Local Summarization and Multi-Level LSH for Retrieving Multi-Variant Audio Tracks

Yi Yu, Michel Crucianu 1, Vincent Oria, Lei Chen
1 CEDRIC - VERTIGO - CEDRIC. Données complexes, apprentissage et représentations
CEDRIC - Centre d'études et de recherche en informatique et communications
Abstract : In this paper we study the problem of detecting and grouping multi-variant audio tracks in large audio datasets. To address this issue, a fast and reliable retrieval method is necessary. But reliability requires elaborate representations of audio content, which challenges fast retrieval by similarity from a large audio database. To find a better tradeoff between retrieval quality and efficiency, we put forward an approach relying on local summarization and multi-level Locality-Sensitive Hashing (LSH). More precisely, each audio track is divided into multiple Continuously Correlated Periods (CCP) of variable length according to spectral similarity. The description for each CCP is calculated based on its Weighted Mean Chroma (WMC). A track is thus represented as a sequence of WMCs. Then, an adapted two-level LSH is employed for efficiently delineating a narrow relevant search region. The coarse hashing level restricts search to items having a non-negligible similarity to the query. The subsequent, refined level only returns items showing a much higher similarity. Experimental evaluations performed on a real multi-variant audio dataset confirm that our approach supports fast and reliable retrieval of audio track variants.
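The pipeline outlined in the abstract (segment a track into CCPs, summarize each CCP by a WMC, then filter candidates with a coarse and a refined hash level) can be illustrated with a minimal sketch. The record does not give the actual algorithms, so everything below is an assumption made for illustration: the cosine-similarity threshold used to cut segments, the use of frame energy as the weight, the sign-of-random-projection hash, and all names (segment_ccp, weighted_mean_chroma, TwoLevelLSH). It is not the authors' implementation.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def segment_ccp(frames, threshold=0.9):
    """Cut a frame sequence wherever adjacent frames fall below `threshold`.

    Assumption: stands in for the paper's spectral-similarity criterion.
    """
    if not frames:
        return []
    segments, current = [], [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        if cosine(prev, cur) >= threshold:
            current.append(cur)
        else:
            segments.append(current)
            current = [cur]
    segments.append(current)
    return segments


def weighted_mean_chroma(segment):
    """Mean chroma of a segment, weighting each frame by its total energy.

    Assumption: the weighting scheme is hypothetical.
    """
    dim = len(segment[0])
    weights = [sum(frame) for frame in segment]
    total = sum(weights)
    if total == 0:
        return [0.0] * dim
    return [sum(w * f[i] for w, f in zip(weights, segment)) / total
            for i in range(dim)]


class TwoLevelLSH:
    """Coarse buckets (few bits) that each contain refined buckets (more bits)."""

    def __init__(self, dim=12, coarse_bits=4, fine_bits=16, seed=0):
        rng = random.Random(seed)
        self.coarse = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(coarse_bits)]
        self.fine = [[rng.gauss(0, 1) for _ in range(dim)]
                     for _ in range(fine_bits)]
        self.buckets = defaultdict(lambda: defaultdict(set))

    @staticmethod
    def _key(planes, v):
        # One bit per hyperplane: which side of the plane the vector lies on.
        return tuple(sum(p * x for p, x in zip(plane, v)) >= 0
                     for plane in planes)

    def add(self, track_id, wmc):
        coarse_key = self._key(self.coarse, wmc)
        fine_key = self._key(self.fine, wmc)
        self.buckets[coarse_key][fine_key].add(track_id)

    def query(self, wmc):
        # Level 1: keep only the coarse bucket; level 2: the refined bucket in it.
        coarse_bucket = self.buckets.get(self._key(self.coarse, wmc))
        if coarse_bucket is None:
            return set()
        return set(coarse_bucket.get(self._key(self.fine, wmc), set()))


# Toy 12-bin chroma frames: two frames on one pitch class, three on another.
a = [1.0] + [0.0] * 11
b = [0.0, 1.0] + [0.0] * 10
segments = segment_ccp([a, a, b, b, b])
summary = [weighted_mean_chroma(s) for s in segments]

index = TwoLevelLSH(dim=12)
for wmc in summary:
    index.add("track-1", wmc)
candidates = index.query(summary[0])
```

In this sketch a query only reaches the refined level through the coarse bucket it falls into, which mirrors the idea of narrowing the search region in two steps. A practical system would typically use several hash tables per level to reduce missed matches.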
Document type :
Conference papers
Contributor : Laboratoire CEDRIC
Submitted on : Friday, March 6, 2015 - 11:24:03 AM
Last modification on : Friday, August 5, 2022 - 2:54:01 PM


  • HAL Id : hal-01125700, version 1



Yi Yu, Michel Crucianu, Vincent Oria, Lei Chen. Local Summarization and Multi-Level LSH for Retrieving Multi-Variant Audio Tracks. MM'09: ACM Multimedia, Beijing, China, Jan 2009. pp.341-350. ⟨hal-01125700⟩