Being prepared in a sparse world: the case of KNN graph construction

Antoine Boutet 1 Anne-Marie Kermarrec 2 Nupur Mittal 2 François Taïani 2
1 DRIM - Distribution, Recherche d'Information et Mobilité
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
2 ASAP - As Scalable As Possible: foundations of large scale dynamic distributed systems
Inria Rennes – Bretagne Atlantique , IRISA-D1 - SYSTÈMES LARGE ÉCHELLE
Abstract : K-Nearest-Neighbor (KNN) graphs have emerged as a fundamental building block of many on-line services providing recommendation, similarity search and classification. Constructing a KNN graph rapidly and accurately is, however, a computationally intensive task. As data volumes keep growing, speed and the ability to scale out are becoming critical factors when deploying a KNN algorithm. In this work, we present KIFF, a generic, fast and scalable KNN graph construction algorithm. KIFF directly exploits the bipartite nature of most datasets to which KNN algorithms are applied. This simple but powerful strategy drastically limits the computational cost required to rapidly converge to an accurate KNN solution, especially for sparse datasets. Our evaluation on a representative range of datasets show that KIFF provides, on average, a speed-up factor of 14 against recent state-of-the art solutions while improving the quality of the KNN approximation by 18%.
Complete list of metadatas

https://hal.inria.fr/hal-01251010
Contributor : Antoine Boutet <>
Submitted on : Thursday, June 30, 2016 - 12:20:02 PM
Last modification on : Thursday, February 7, 2019 - 4:49:47 PM
Long-term archiving on : Saturday, October 1, 2016 - 10:42:36 AM

File

PID4067857.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01251010, version 1

Citation

Antoine Boutet, Anne-Marie Kermarrec, Nupur Mittal, François Taïani. Being prepared in a sparse world: the case of KNN graph construction. International Conference on Data Engineering (ICDE), May 2016, Helsinki, Finland. ⟨hal-01251010⟩

Share

Metrics

Record views

1199

Files downloads

662