SALSAS: Sub-linear Active Learning Strategy with Approximate k-NN Search

David Gorisse (1), Matthieu Cord (2), Frédéric Precioso (1)
(1) MIDI - Multimedia Indexation and Data Integration
ETIS - Equipes Traitement de l'Information et Systèmes
(2) MALIRE - Machine Learning and Information Retrieval
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract : With the democratization of digital imaging devices, image databases are growing exponentially, so providing users with a system for searching these databases is a critical issue. However, bridging the semantic gap between the (semantic) concept(s) the user is looking for and the (semantic) content of the images is difficult. In content-based image retrieval (CBIR) systems, a classic scenario is to formulate the user query, at first, with only one example (i.e. one image). To address this problem, active learning is a powerful technique that involves the user in interactively refining the query concept through relevance feedback loops, asking the user whether some strategically selected images are relevant or not. However, the complexity of state-of-the-art active learning methods is linear in the size of the database, which dramatically slows down retrieval systems on very large databases and is no longer acceptable for users. In this article, we propose a strategy to overcome the scalability limitations of active learning by exploiting ultra-fast k-nearest-neighbor (k-NN) methods, such as locality-sensitive hashing (LSH), and combining them with an active learning strategy dedicated to very large databases. We define a new LSH scheme adapted to the χ² distance, which often leads to better results in the image retrieval context. We evaluate the approach on databases of between 5 K and 180 K images. The results show that the complexity of our interactive retrieval system is almost constant in the size of the database. For a database of 180 K images, our system is 45 times faster than exhaustive search (linear scan) while reaching similar accuracy.
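To make the baseline named in the abstract concrete, the following is a minimal sketch of exhaustive (linear scan) k-NN search under a χ² distance. It is not the authors' LSH scheme, which this record does not detail. The χ² convention used here (sum of (x_i - y_i)² / (x_i + y_i), with empty bins skipped), the function names `chi2_distance` and `knn_linear_scan`, and the toy histograms are all assumptions made for illustration.

```python
import heapq


def chi2_distance(x, y):
    """Chi-square distance between two non-negative histograms.

    Assumed convention: sum_i (x_i - y_i)^2 / (x_i + y_i).
    Other conventions add a factor of 1/2.
    """
    total = 0.0
    for xi, yi in zip(x, y):
        s = xi + yi
        if s > 0:  # skip bins that are empty in both histograms
            total += (xi - yi) ** 2 / s
    return total


def knn_linear_scan(query, database, k):
    """Return the indices of the k nearest vectors by exhaustive search.

    Every database vector is visited once, so the cost grows linearly
    with the database size; this is the cost an approximate k-NN index
    such as LSH is meant to avoid.
    """
    scored = ((chi2_distance(query, v), i) for i, v in enumerate(database))
    return [i for _, i in heapq.nsmallest(k, scored)]


# Hypothetical toy histograms, not data from the article.
database = [
    [0.5, 0.5, 0.0],
    [0.4, 0.6, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
]
nearest = knn_linear_scan([0.5, 0.5, 0.0], database, k=2)
```

In this toy case the query matches the first histogram exactly, so `nearest` contains index 0 followed by the next closest histogram, index 1.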
Document type :
Journal articles

https://hal.archives-ouvertes.fr/hal-00773102
Contributor : Michel Jordan
Submitted on : Friday, January 11, 2013 - 3:52:55 PM
Last modification on : Friday, October 4, 2019 - 12:14:02 PM

Citation

David Gorisse, Matthieu Cord, Frédéric Precioso. SALSAS: Sub-linear Active Learning Strategy with Approximate k-NN Search. Pattern Recognition, Elsevier, 2011, 44 (10-11), pp.2244-2254. ⟨10.1016/j.patcog.2010.12.009⟩. ⟨hal-00773102⟩
