M. Agueh and G. Carlier, Barycenters in the Wasserstein space, SIAM Journal on Mathematical Analysis, vol.43, pp.904-924, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00637399

M. Aharon, M. Elad, and A. Bruckstein, An algorithm for designing overcomplete dictionaries for sparse representation, IEEE Transactions on Signal Processing, vol.54, pp.4311-4322, 2006.

N. Aifanti, C. Papachristou, and A. Delopoulos, The MUG facial expression database, Image Analysis for Multimedia Interactive Services (WIAMIS), 2010 11th International Workshop on, pp.1-4, 2010.

J. Altschuler, J. Weed, and P. Rigollet, Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration, 2017.

M. Arjovsky, S. Chintala, and L. Bottou, Wasserstein GAN, 2017.

F. Bassetti, A. Bodini, and E. Regazzini, On minimum Kantorovich distance estimators, Statistics & Probability Letters, vol.76, pp.1298-1302, 2006.

F. Bassetti and E. Regazzini, Asymptotic properties and robustness of minimum dissimilarity estimators of location-scale parameters, Theory of Probability & Its Applications, vol.50, pp.171-186, 2006.

J. Benamou, G. Carlier, M. Cuturi, L. Nenna, and G. Peyré, Iterative Bregman projections for regularized transportation problems, SIAM Journal on Scientific Computing, vol.37, pp.1111-1138, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01096124

E. Bernton, P. E. Jacob, M. Gerber, and C. P. Robert, Inference in generative models using the Wasserstein distance, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01517550

D. P. Bertsekas, The auction algorithm: A distributed relaxation method for the assignment problem, Annals of Operations Research, vol.14, pp.105-123, 1988.

J. Bigot, R. Gouet, T. Klein, and A. López, Geodesic PCA in the Wasserstein space, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01978864

D. Blei and J. Lafferty, Topic models, Text mining: classification, clustering, and applications, vol.10, p.71, 2009.

E. Boissard, T. Le Gouic, and J. Loubes, Distribution template estimate with Wasserstein metrics, Bernoulli, vol.21, pp.740-759, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01291302

N. Bonneel, G. Peyré, and M. Cuturi, Wasserstein barycentric coordinates: Histogram regression using optimal transport, Proceedings of SIGGRAPH 2016, p.35, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01303148

N. Bonneel, J. Rabin, G. Peyré, and H. Pfister, Sliced and Radon Wasserstein Barycenters of Measures, Journal of Mathematical Imaging and Vision, vol.51, pp.22-45, 2015.
URL : https://hal.archives-ouvertes.fr/hal-00881872

G. Carlier, A. Oberman, and E. Oudet, Numerical methods for matching for teams and Wasserstein barycenters, ESAIM: Mathematical Modelling and Numerical Analysis, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01112224

L. Chizat, G. Peyré, B. Schmitzer, and F. Vialard, Scaling algorithms for unbalanced transport problems, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01434914

M. Cuturi, Sinkhorn distances: Lightspeed computation of optimal transport, Advances in Neural Information Processing Systems, pp.2292-2300, 2013.

M. Cuturi and A. Doucet, Fast computation of Wasserstein barycenters, Proceedings of The 31st International Conference on Machine Learning, pp.685-693, 2014.

M. Cuturi and G. Peyré, A smoothed dual approach for variational Wasserstein problems, SIAM Journal on Imaging Sciences, vol.9, pp.320-343, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01188954

A. d'Aspremont, L. El Ghaoui, M. I. Jordan, and G. R. Lanckriet, A direct formulation for sparse PCA using semidefinite programming, SIAM Review, vol.49, pp.434-448, 2007.

W. E. Deming and F. F. Stephan, On a least squares adjustment of a sampled frequency table when the expected marginal totals are known, The Annals of Mathematical Statistics, vol.11, pp.427-444, 1940.

S. Erlander and N. F. Stewart, The gravity model in transportation analysis: theory and extensions, vol.3, 1990.

P. T. Fletcher, C. Lu, S. M. Pizer, and S. Joshi, Principal geodesic analysis for the study of nonlinear statistics of shape, IEEE Transactions on Medical Imaging, vol.23, pp.995-1005, 2004.

J. Franklin and J. Lorenz, On the scaling of multidimensional matrices, Linear Algebra and its Applications, vol.114, pp.717-735, 1989.

M. Fréchet, Les éléments aléatoires de nature quelconque dans un espace distancié, Ann. Inst. H. Poincaré, vol.10, pp.215-310, 1948.

C. Frogner, C. Zhang, H. Mobahi, M. Araya, and T. A. Poggio, Learning with a Wasserstein loss, Advances in Neural Information Processing Systems, pp.2053-2061, 2015.

W. Gao, J. Chen, C. Richard, and J. Huang, Online dictionary learning for kernel LMS, IEEE Transactions on Signal Processing, vol.62, pp.2765-2777, 2014.

A. Genevay, G. Peyré, and M. Cuturi, Learning generative models with Sinkhorn divergences, 2017.

A. Griewank and A. Walther, Evaluating derivatives: principles and techniques of algorithmic differentiation, 2008.

S. Haker, L. Zhu, A. Tannenbaum, and S. Angenent, Optimal mass transport for registration and warping, International Journal of Computer Vision, vol.60, pp.225-240, 2004.

M. Harandi and M. Salzmann, Riemannian coding and dictionary learning: Kernels to the rescue, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3926-3935, 2015.

M. Harandi, C. Sanderson, C. Shen, and B. C. Lovell, Dictionary learning and sparse coding on Grassmann manifolds: An extrinsic solution, Proceedings of the IEEE International Conference on Computer Vision, pp.3120-3127, 2013.

M. T. Harandi, C. Sanderson, R. Hartley, and B. C. Lovell, Sparse coding and dictionary learning for symmetric positive definite matrices: A kernel approach, Computer Vision - ECCV 2012, pp.216-229, 2012.

G. E. Hinton and R. R. Salakhutdinov, Reducing the dimensionality of data with neural networks, Science, pp.504-507, 2006.

J. Ho, Y. Xie, and B. Vemuri, On a nonlinear generalization of sparse coding and dictionary learning, International conference on machine learning, pp.1480-1488, 2013.

A. Hyvärinen, J. Karhunen, and E. Oja, Independent component analysis, vol.46, 2004.

Z. Irace and H. Batatia, Motion-based interpolation to estimate spatially variant PSF in positron emission tomography, Signal Processing Conference (EUSIPCO), 2013 Proceedings of the 21st European, pp.1-5, 2013.

H. W. Kuhn, The Hungarian method for the assignment problem, Naval Research Logistics Quarterly, vol.2, pp.83-97, 1955.

R. Laureijs, J. Amiaux, S. Arduini, J. Augueres, J. Brinchmann et al., Euclid definition study report, 2011.
URL : https://hal.archives-ouvertes.fr/in2p3-00712239

D. D. Lee and H. S. Seung, Learning the parts of objects by non-negative matrix factorization, Nature, vol.401, pp.788-791, 1999.

H. Lee, A. Battle, R. Raina, and A. Y. Ng, Efficient sparse coding algorithms, Advances in Neural Information Processing Systems, pp.801-808, 2007.

C. Léonard, A survey of the Schrödinger problem and some of its connections with optimal transport, Discrete and Continuous Dynamical Systems - Series A (DCDS-A), vol.34, pp.1533-1574, 2014.

P. Li, Q. Wang, W. Zuo, and L. Zhang, Log-Euclidean kernels for sparse representation and dictionary learning, Proceedings of the IEEE International Conference on Computer Vision, pp.1601-1608, 2013.

H. Liu, J. Qin, H. Cheng, and F. Sun, Robust kernel dictionary learning using a whole sequence convergent algorithm, IJCAI, vol.1, p.5, 2015.

J. Mairal, F. Bach, J. Ponce, and G. Sapiro, Online learning for matrix factorization and sparse coding, Journal of Machine Learning Research, vol.11, pp.19-60, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00408716

S. Mallat, A wavelet tour of signal processing, 1999.

R. J. McCann, A convexity principle for interacting gases, Advances in Mathematics, vol.128, pp.153-179, 1997.

Q. Mérigot, A Multiscale Approach to Optimal Transport, Computer Graphics Forum, 2011.

G. Monge, Mémoire sur la théorie des déblais et des remblais, Histoire de l'Académie Royale des Sciences de Paris, 1781.

G. Montavon, K. Müller, and M. Cuturi, Wasserstein training of restricted Boltzmann machines, Advances in Neural Information Processing Systems, pp.3711-3719, 2016.

J. L. Morales and J. Nocedal, Remark on "Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound constrained optimization", ACM Transactions on Mathematical Software (TOMS), vol.38, p.7, 2011.

Y. Nesterov, Introductory lectures on convex optimization: A basic course, vol.87, 2013.

F. Ngolè and J. Starck, PSF field learning based on optimal transport distances, 2017.

F. Ngolè, J. Starck, S. Ronayette, K. Okumura, and J. Amiaux, Super-resolution method using sparse regularization for point-spread function recovery, Astronomy & Astrophysics, vol.575, p.86, 2015.

N. Papadakis, Optimal Transport for Image Processing, Habilitation à diriger des recherches, 2015.

K. Pearson, On lines and planes of closest fit to systems of points in space, The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, vol.2, pp.559-572, 1901.

J. Pennington, R. Socher, and C. D. Manning, GloVe: Global Vectors for Word Representation, EMNLP, vol.14, pp.1532-1543, 2014.

G. Peyré, L. Chizat, F. Vialard, and J. Solomon, Quantum optimal transport for tensor field processing, 2016.

F. Pitié, A. C. Kokaram, and R. Dahyot, N-dimensional probability density function transfer and its application to colour transfer, Proceedings of the Tenth IEEE International Conference on Computer Vision, vol.2, pp.1434-1439, 2005.

B. T. Polyak, Some methods of speeding up the convergence of iteration methods, USSR Computational Mathematics and Mathematical Physics, vol.4, pp.1-17, 1964.

Y. Quan, C. Bao, and H. Ji, Equiangular kernel dictionary learning with applications to dynamic texture analysis, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.308-316, 2016.

J. Rabin, G. Peyré, J. Delon, and M. Bernot, Wasserstein barycenter and its application to texture mixing, International Conference on Scale Space and Variational Methods in Computer Vision, pp.435-446, 2011.

S. Rachev and L. Rüschendorf, Mass Transportation Problems: Theory, vol.1, 1998.

A. Rolet, M. Cuturi, and G. Peyré, Fast dictionary learning with a smoothed Wasserstein loss, Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, pp.630-638, 2016.

R. Rubinstein, M. Zibulevsky, and M. Elad, Efficient implementation of the K-SVD algorithm using batch orthogonal matching pursuit, CS Technion, vol.40, pp.1-15, 2008.

Y. Rubner, C. Tomasi, and L. J. Guibas, The earth mover's distance as a metric for image retrieval, Int. J. Comput. Vision, vol.40, pp.99-121, 2000.

G. Salton and M. J. McGill, Introduction to modern information retrieval, 1986.

R. Sandler and M. Lindenbaum, Nonnegative matrix factorization with earth mover's distance metric, Computer Vision and Pattern Recognition, CVPR 2009. IEEE Conference on, pp.1873-1880, 2009.

M. A. Schmitz, M. Heitz, N. Bonneel, F. Ngolè, D. Coeurjolly et al., Optimal transport-based dictionary learning and its application to Euclid-like point spread function representation, SPIE Optical Engineering+ Applications, International Society for Optics and Photonics, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01635342

B. Schmitzer, Stabilized sparse scaling algorithms for entropy regularized transport problems, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01385251

B. Schölkopf, A. Smola, and K. Müller, Kernel principal component analysis, Artificial Neural Networks - ICANN'97, pp.583-588, 1997.

E. Schrödinger, Über die Umkehrung der Naturgesetze, Verlag Akademie der Wissenschaften in Kommission bei, 1931.

V. Seguy and M. Cuturi, Principal geodesic analysis for probability measures under the optimal transport metric, Advances in Neural Information Processing Systems, pp.3312-3320, 2015.

S. Shirdhonkar and D. W. Jacobs, Approximate earth mover's distance in linear time, Computer Vision and Pattern Recognition, pp.1-8, 2008.

R. Sinkhorn, Diagonal equivalence to matrices with prescribed row and column sums, The American Mathematical Monthly, vol.74, pp.402-405, 1967.

J. Solomon, F. de Goes, G. Peyré, M. Cuturi, A. Butscher et al., Convolutional Wasserstein distances: Efficient optimal transportation on geometric domains, ACM Transactions on Graphics (TOG), vol.34, p.66, 2015.

J. Solomon, R. Rustamov, L. Guibas, and A. Butscher, Wasserstein propagation for semi-supervised learning, Proceedings of The 31st International Conference on Machine Learning, pp.306-314, 2014.

M. Talagrand, Transportation cost for Gaussian and other product measures, Geometric and Functional Analysis, vol.6, pp.587-600, 1996.

Theano Development Team, Theano: A Python framework for fast computation of mathematical expressions, arXiv e-prints, 2016.

M. Turk and A. Pentland, Eigenfaces for Recognition, Journal of Cognitive Neuroscience, vol.3, pp.71-86, 1991.

H. Van Nguyen, V. M. Patel, N. M. Nasrabadi, and R. Chellappa, Design of non-linear kernel dictionaries for object recognition, IEEE Transactions on Image Processing, vol.22, pp.5123-5135, 2013.

C. Villani, Topics in optimal transportation, Optimal transport: old and new, vol.338, 2003.

W. Wang, D. Slepcev, S. Basu, J. A. Ozolek, and G. K. Rohde, A linear optimal transportation framework for quantifying and visualizing variations in sets of images, International Journal of Computer Vision, vol.101, pp.254-269, 2013.

J. Ye, P. Wu, J. Z. Wang, and J. Li, Fast discrete distribution clustering using Wasserstein barycenter with sparse support, IEEE Transactions on Signal Processing, vol.65, pp.2317-2332, 2017.

S. Zavriev and F. Kostyuk, Heavy-ball method in nonconvex optimization problems, Computational Mathematics and Modeling, vol.4, pp.336-341, 1993.