On the visualization of high-dimensional data
Abstract
Computing distances in high-dimensional spaces is plagued by the empty space phenomenon, which may harm distance-based algorithms for data visualization. We focus on transforming high-dimensional numeric data for visualization using a 2D kernel PCA projection. Gaussian and p-Gaussian kernels are often advocated when confronted with such data; we give some insight into their properties and behaviour in the context of a 2D projection for visualization. An alternative approach that directly affects the distribution of distances is proposed. It also allows indirect control of the distribution of the resulting kernel values generated by the Gaussian kernel function. Finally, such projections induce some artifacts which, even if they are not handled, should not be ignored.
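As a rough illustration of the empty space phenomenon mentioned above, the following minimal sketch draws uniform random points and measures how the relative spread of pairwise Euclidean distances shrinks as the dimension grows. The uniform data, the function names (`pairwise_distances`, `relative_spread`, `gaussian_kernel_values`) and the bandwidth parameter `sigma` are assumptions made for this example only; they are not taken from the paper, and the sketch does not implement the approach it proposes.

```python
import math
import random
import statistics


def pairwise_distances(points):
    """Euclidean distances between all distinct pairs of points."""
    return [
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    ]


def relative_spread(dim, n_points=100, seed=0):
    """Standard deviation of pairwise distances divided by their mean.

    Points are drawn uniformly in the unit hypercube (an assumption
    made for illustration, not the data used in the paper).
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    distances = pairwise_distances(points)
    return statistics.pstdev(distances) / statistics.mean(distances)


def gaussian_kernel_values(distances, sigma):
    """Gaussian kernel exp(-d^2 / (2 sigma^2)) applied to each distance."""
    return [math.exp(-(d * d) / (2.0 * sigma ** 2)) for d in distances]


if __name__ == "__main__":
    # The ratio shrinks with the dimension: distances concentrate
    # around their mean, so kernel values computed from them do too.
    for dim in (2, 10, 100, 1000):
        print(dim, relative_spread(dim))
```

Because the Gaussian kernel is a function of the distance alone, a narrow distribution of distances yields a narrow distribution of kernel values, which is why reshaping the distances gives indirect control over the kernel values.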
Domains
Other [cs.OH]
Origin: Files produced by the author(s)