Conference paper, Year: 2015

Cluster-based data oriented hashing

Abstract

Many multidimensional hashing schemes have been actively studied in recent years, providing efficient nearest neighbor search. Several hashing families can be distinguished. Learning based hashing provides better hash function selectivity by learning the dataset distribution. The spatial hashing family proposes a partition of the multidimensional space that is better adapted to the distribution of the data points. Despite their efficiency in solving the nearest neighbor search problem, multidimensional hashing techniques suffer from scalability issues. In this paper, we propose a novel hashing algorithm, named Cluster Based Data Oriented Hashing, that combines spatial hashing and learning based hashing techniques. The proposed approach first applies a clustering algorithm to structure the multidimensional space into clusters. Then, in each cluster, a learning based hashing algorithm is applied by selecting an appropriate hash function that fits the data distribution. Experimental comparisons with standard Euclidean Locality Sensitive Hashing demonstrate the effectiveness of the proposed method for large datasets.
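
The two-stage idea described in the abstract (cluster the space, then hash within each cluster using a function fitted to that cluster's data) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the clustering method or the learned hash function, so the sketch assumes plain k-means for the clustering step and a quantized projection onto each cluster's principal directions as a stand-in for the learning based hash. All function names and parameters (`build_index`, `query`, `n_dirs`, `width`) are hypothetical.

```python
import numpy as np
from collections import defaultdict

def nearest_centroid(x, centroids):
    """Index of the centroid closest to x (Euclidean distance)."""
    return int(np.argmin(((centroids - x) ** 2).sum(axis=1)))

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed stand-in for the clustering step)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        labels = np.array([nearest_centroid(x, centroids) for x in X])
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def hash_key(x, mean, dirs, width):
    """Quantize the projections of x onto the cluster's own directions."""
    return tuple(int(v) for v in np.floor((dirs @ (x - mean)) / width))

def build_index(X, k=2, n_dirs=2, width=1.0):
    """Cluster X, fit one hash function per cluster, and fill the buckets."""
    centroids = kmeans(X, k)
    labels = [nearest_centroid(x, centroids) for x in X]
    models = {}
    for j in range(k):
        members = X[np.array(labels) == j]
        if len(members) == 0:
            continue
        mean = members.mean(axis=0)
        # Rows of vt are the principal directions of this cluster
        # (assumption: PCA plays the role of the "learned" hash function).
        _, _, vt = np.linalg.svd(members - mean, full_matrices=False)
        models[j] = (mean, vt[:n_dirs])
    buckets = defaultdict(list)
    for i, x in enumerate(X):
        mean, dirs = models[labels[i]]
        buckets[(labels[i], hash_key(x, mean, dirs, width))].append(i)
    return centroids, models, buckets, width

def query(q, X, index):
    """Return candidate indices from q's bucket, nearest first."""
    centroids, models, buckets, width = index
    j = nearest_centroid(q, centroids)
    if j not in models:
        return []
    mean, dirs = models[j]
    candidates = buckets.get((j, hash_key(q, mean, dirs, width)), [])
    return sorted(candidates, key=lambda i: np.linalg.norm(X[i] - q))
```

In this sketch a query is compared only against points that share both its cluster and its bucket, which is where the reduction in search cost would come from. A real implementation would likely probe neighboring buckets or use several hash tables to improve recall.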
File not deposited

Dates and versions

hal-01262481, version 1 (26-01-2016)

Cite

Sanaa Chafik, Imane Daoudi, Mounim El Yacoubi, Hamid El Ouardi. Cluster-based data oriented hashing. DSAA 2015: 2nd International Conference on Data Science and Advanced Analytics, Oct 2015, Paris, France. pp. 1-7, ⟨10.1109/DSAA.2015.7344895⟩. ⟨hal-01262481⟩