Conference paper, Year: 2020

Taming the curse of dimensionality for perturbed token identification

Olga Assainova
Jérémy Rouot
Ehsan Sedgh-Gooya
  • Role: Author
  • PersonId: 745704
  • IdHAL: esedghgo

Abstract

In the context of data tokenization, we model a token as a vector of a finite-dimensional metric space E and, given a finite subset of E called the token set, we address the problem of deciding whether a given token lies in a small neighborhood of another token. We derive conditions that characterize the nearest token of a given one and show that these conditions are fulfilled asymptotically as the dimension of E tends to infinity. Whereas classical nearest neighbor search is inefficient for solving this problem, we propose a new probabilistic algorithm, which becomes efficient if the dimension of E is large enough.
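
To make the problem statement concrete, here is a minimal sketch of the classical baseline the abstract refers to: a linear-scan nearest neighbor search followed by a radius test. It is not the probabilistic algorithm proposed in the paper, which this page does not describe. The Euclidean metric, the uniformly random token set, the noise level, and the names `nearest_token` and `identify` are all assumptions made for illustration.

```python
import math
import random


def nearest_token(query, tokens):
    """Return (index, distance) of the token closest to `query`.

    Linear scan over the whole token set: cost grows with
    len(tokens) * dimension, which is the baseline cost discussed above.
    """
    best_index, best_dist = -1, math.inf
    for i, token in enumerate(tokens):
        d = math.dist(query, token)  # Euclidean distance (assumed metric)
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index, best_dist


def identify(query, tokens, radius):
    """Return the index of the nearest token if it lies within `radius`, else None."""
    index, dist = nearest_token(query, tokens)
    return index if dist <= radius else None


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n_tokens = 256, 100  # hypothetical sizes
    # Hypothetical token set: points drawn uniformly from [0, 1]^dim.
    tokens = [[rng.random() for _ in range(dim)] for _ in range(n_tokens)]
    # Perturb one token with small Gaussian noise (assumed noise model).
    perturbed = [x + rng.gauss(0.0, 0.01) for x in tokens[42]]
    print(identify(perturbed, tokens, radius=1.0))
```

Under these assumptions the perturbation has a norm of roughly 0.16, while two distinct random tokens are roughly 6.5 apart, so the radius test separates the original token from all others by a wide margin. This gap between the perturbation size and the inter-token distances is the kind of high-dimensional behavior the abstract points to.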
Main file
token2020.pdf (291.89 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02865114, version 1 (30-06-2020)
hal-02865114, version 2 (01-09-2020)

Identifiers

  • HAL Id: hal-02865114, version 2

Cite

Olga Assainova, Jérémy Rouot, Ehsan Sedgh-Gooya. Taming the curse of dimensionality for perturbed token identification. 10th International Conference on Image Processing Theory, Tools and Applications, Nov 2020, Paris, France. ⟨hal-02865114v2⟩