Log-PCA versus Geodesic PCA of histograms in the Wasserstein space

Abstract : This paper is concerned with the statistical analysis of data sets whose elements are random histograms. To learn principal modes of variation from such data, we consider the problem of computing the PCA of histograms with respect to the 2-Wasserstein distance between probability measures. To this end, we compare the methods of log-PCA and geodesic PCA in the Wasserstein space, as introduced by Bigot et al. (2015) and Seguy and Cuturi (2015). Geodesic PCA involves solving a non-convex optimization problem, and we propose a novel forward-backward algorithm to solve it approximately. This allows a detailed comparison between log-PCA and geodesic PCA of one-dimensional histograms, which we carry out on various data sets, stressing the benefits and drawbacks of each method. We then extend these results to two-dimensional data and compare both methods in that setting.
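
To make the log-PCA side of the comparison concrete, here is a minimal sketch for one-dimensional histograms. It is not the authors' code. It relies on the standard one-dimensional facts that the Wasserstein barycenter has the average quantile function, and that the log map at the barycenter becomes, after a change of variables, the difference between each quantile function and that average. The function names (`quantile_functions`, `log_pca_1d`), the quantile grid size, and the synthetic Gaussian data are all assumptions made for illustration.

```python
import numpy as np

def quantile_functions(hists, grid, levels):
    """Evaluate each histogram's quantile function at the given levels."""
    # Normalize the bin weights so that every histogram sums to one.
    probs = hists / hists.sum(axis=1, keepdims=True)
    cdfs = np.cumsum(probs, axis=1)
    # Invert each CDF by linear interpolation (assumes positive weights,
    # so that the CDFs are increasing on the range of `levels`).
    return np.stack([np.interp(levels, cdf, grid) for cdf in cdfs])

def log_pca_1d(hists, grid, n_levels=200, n_components=2):
    """Sketch of log-PCA for 1D histograms via quantile functions."""
    levels = (np.arange(n_levels) + 0.5) / n_levels
    Q = quantile_functions(hists, grid, levels)
    # In 1D, the Wasserstein barycenter has the mean quantile function.
    Q_bar = Q.mean(axis=0)
    # Log maps at the barycenter, expressed on the quantile-level grid.
    logs = Q - Q_bar
    # Standard PCA of the log maps through an SVD.
    U, S, Vt = np.linalg.svd(logs, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    scores = (U * S)[:, :n_components]
    return Q_bar, Vt[:n_components], scores, explained[:n_components]

# Synthetic data (an assumption): Gaussian-shaped histograms that differ
# only by a translation, so one component should capture the variation.
rng = np.random.default_rng(0)
grid = np.linspace(-5.0, 5.0, 501)
means = rng.uniform(-2.0, 2.0, size=30)
hists = np.exp(-0.5 * ((grid[None, :] - means[:, None]) / 0.5) ** 2)

Q_bar, components, scores, explained = log_pca_1d(hists, grid)
print(explained)
```

The sketch never checks that `Q_bar + t * components[k]` stays non-decreasing in the quantile level, so a point far along a component may not be the quantile function of any probability measure. Geodesic PCA, as described in the abstract, instead solves a non-convex optimization problem, which this sketch does not attempt.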
Document type :
Journal articles

Cited literature : 17 references

https://hal.archives-ouvertes.fr/hal-01581699
Contributor : Vivien Seguy
Submitted on : Tuesday, September 5, 2017 - 6:01:55 AM
Last modification on : Tuesday, April 2, 2019 - 2:27:12 AM

File

HistPCAW2_arxiv.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01581699, version 1
  • ARXIV : 1708.08143

Citation

Elsa Cazelles, Vivien Seguy, Jérémie Bigot, Marco Cuturi, Nicolas Papadakis. Log-PCA versus Geodesic PCA of histograms in the Wasserstein space. SIAM Journal on Scientific Computing, Society for Industrial and Applied Mathematics, 2018, 40 (2), pp.B429-B456. ⟨hal-01581699⟩
