Algebraic Properties to Optimize kNN Queries

Abstract: New applications that require Database Management Systems (DBMSs), such as storing and retrieving complex data (images, sound, time series, genetic data, etc.) and analytical data processing (data mining, social network analysis, etc.), increasingly demand new ways of expressing predicates. Among the most studied new predicates are the similarity-based ones, of which the two most common are the similarity range and the k-nearest neighbor predicates. The k-nearest neighbor predicate is the most interesting for several applications, including Content-Based Image Retrieval (CBIR) and Data Mining (DM) tasks, yet it is also the most expensive to evaluate. A strong motivation for including operators that execute the k-nearest neighbor predicate inside a DBMS is to exploit query rewriting based on algebraic properties to optimize query execution. Unfortunately, too few properties of the k-nearest neighbor operator have been identified so far that allow query rewriting rules leading to more efficient query execution. In fact, a k-nearest neighbor operator does not even commute with another k-nearest neighbor operator or with any other attribute comparison operator (similarity range or any of the traditional attribute comparison operators). In this paper, we investigate a new class of properties for the k-nearest neighbor operator based not on expression equivalence, but on result set inclusion. We develop a complete set of properties based on set inclusion, which can be successfully employed to rewrite query expressions involving k-nearest neighbor operators combined with any of the traditional attribute comparison operators or with other k-nearest neighbor and similarity range operators. We also give examples of how applying those properties to rewrite queries improves retrieval efficiency.
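
The non-commutativity mentioned in the abstract can be illustrated with a minimal Python sketch. The toy data, the `knn` and `select` helpers, and the tie-breaking rule are assumptions made for illustration and are not taken from the paper. The inclusion it checks holds whenever ties are broken deterministically; it is shown only as an example of a set-inclusion relationship and is not claimed to be one of the properties the paper states.

```python
from math import dist

def knn(rows, query, k):
    """Return the k rows nearest to `query`; ties are broken by row id (assumption)."""
    return sorted(rows, key=lambda r: (dist(r["pos"], query), r["id"]))[:k]

def select(rows, pred):
    """Traditional attribute selection: keep the rows that satisfy `pred`."""
    return [r for r in rows if pred(r)]

def ids(rows):
    """Set of row ids, used to compare result sets."""
    return {r["id"] for r in rows}

# Hypothetical table: a 2-D feature vector plus an ordinary attribute.
rows = [
    {"id": 1, "pos": (0.0, 1.0), "price": 90},
    {"id": 2, "pos": (0.0, 2.0), "price": 40},
    {"id": 3, "pos": (0.0, 3.0), "price": 95},
    {"id": 4, "pos": (0.0, 4.0), "price": 30},
]
query = (0.0, 0.0)

def cheap(row):
    return row["price"] < 50

# Plan A: kNN first, then the attribute predicate.
knn_then_select = select(knn(rows, query, 2), cheap)
# Plan B: the attribute predicate first, then kNN.
select_then_knn = knn(select(rows, cheap), query, 2)

# The two plans return different answers, so the operators do not commute ...
print(ids(knn_then_select) == ids(select_then_knn))
# ... but the result of plan A is contained in the result of plan B.
print(ids(knn_then_select) <= ids(select_then_knn))
```

On this data, plan A keeps only row 2, while plan B returns rows 2 and 4. The two expressions are therefore not equivalent, yet one result is a subset of the other, which is the kind of relationship a rewriting rule based on set inclusion would rely on.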
Document type:
Journal article
Journal of Information and Data Management, Brazilian Computer Society, 2011, 2 (3), pp. 385-400

https://hal.archives-ouvertes.fr/hal-00687320
Contributor: Mônica Ribeiro Porto Ferreira
Submitted on: Friday, 13 April 2012 - 12:48:28
Last modified on: Tuesday, 10 March 2015 - 19:44:35
Document(s) archived on: Saturday, 14 July 2012 - 02:28:12

File

JIDM_2011.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00687320, version 1

Citation

Mônica Ribeiro Porto Ferreira, Lucio Fernandes Dutra Santos, Agma Juci Machado Traina, Ires Dias, Richard Chbeir, et al. Algebraic Properties to Optimize kNN Queries. Journal of Information and Data Management, Brazilian Computer Society, 2011, 2 (3), pp. 385-400. 〈hal-00687320〉
