Journal article, Theoretical Computer Science, Year: 2022

Near-neighbor preserving dimension reduction via coverings for doubling subsets of ℓ1

Abstract

Randomized dimensionality reduction is recognized as one of the cornerstones of handling high-dimensional data, originating in foundational works such as the celebrated Johnson-Lindenstrauss Lemma. More specifically, nearest neighbor-preserving embeddings exist for the ℓ2 (Euclidean) and ℓ1 (Manhattan) metrics, as well as for doubling subsets of ℓ2, where doubling dimension is today the most effective way of capturing intrinsic dimensionality and input structure in various applications. These randomized embeddings bound the distortion only for distances between the query point and a point set. Motivated by the foundational character of fast Approximate Nearest Neighbor search in ℓ1, this paper settles an important missing case, namely that of doubling subsets of ℓ1. In particular, we introduce a randomized dimensionality reduction by means of a near neighbor-preserving embedding, which is related to the decision-with-witness problem. The input set is represented by a carefully chosen covering point set; in a second step, the algorithm randomly projects the latter. To obtain the covering point sets, we leverage either approximate r-nets or randomly shifted grids, with different tradeoffs between preprocessing time and target dimension. We exploit Cauchy random variables and derive a concentration bound of independent interest. Our algorithms are rather simple and should therefore be useful in practice.
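The two-step pipeline described in the abstract (cover the input, then randomly project the cover) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names, the grid cell width `cell`, and the target dimension `k` are hypothetical choices made here for illustration, and the abstract does not state how these parameters are set or how distances are evaluated in the target space. The sketch uses a randomly shifted grid for the covering step and a matrix of i.i.d. standard Cauchy entries for the projection step.

```python
import math
import random


def shifted_grid_cover(points, cell, rng):
    """Snap each point to the center of its cell in a randomly shifted grid.

    Returns the distinct cell centers; every input point lies within
    l1 distance dim * cell / 2 of some returned center.
    """
    dim = len(points[0])
    # One uniform random shift per coordinate (the "randomly shifted" part).
    shift = [rng.uniform(0.0, cell) for _ in range(dim)]
    centers = set()
    for p in points:
        idx = [math.floor((x - s) / cell) for x, s in zip(p, shift)]
        centers.add(tuple(s + (i + 0.5) * cell for i, s in zip(idx, shift)))
    return sorted(centers)


def cauchy_matrix(k, dim, rng):
    """k x dim matrix of i.i.d. standard Cauchy entries (inverse-CDF sampling)."""
    return [
        [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(dim)]
        for _ in range(k)
    ]


def project(matrix, x):
    """Linear map x -> matrix * x."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in matrix)


def l1(u, v):
    """Manhattan (l1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def embed_cover(points, cell, k, seed=0):
    """Hypothetical end-to-end helper: cover the input, then project the cover."""
    rng = random.Random(seed)
    centers = shifted_grid_cover(points, cell, rng)
    matrix = cauchy_matrix(k, len(points[0]), rng)
    return centers, matrix, [project(matrix, c) for c in centers]
```

Cauchy variables are the natural choice for ℓ1 because they are 1-stable: each coordinate of the projection of x − y is distributed as ‖x − y‖₁ times a standard Cauchy variable. The heavy tails of that distribution are why a dedicated concentration bound, such as the one the abstract mentions, is needed.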
Main file
EmMaPs-journal.pdf (428.18 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04294296, version 1 (28-11-2023)

License

Attribution


Cite

Ioannis Z. Emiris, Vasilis Margonis, Ioannis Psarros. Near-neighbor preserving dimension reduction via coverings for doubling subsets of ℓ1. Theoretical Computer Science, 2022, 942, pp.169-179. ⟨10.1016/j.tcs.2022.11.031⟩. ⟨hal-04294296⟩
