D. Arthur and S. Vassilvitskii, k-means++: The advantages of careful seeding, Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, 2007.

F. Bach and Z. Harchaoui, DIFFRAC: a discriminative and flexible framework for clustering, Adv. NIPS, 2007.

F. Bach, R. Jenatton, J. Mairal, and G. Obozinski, Optimization with Sparsity-Inducing Penalties, Foundations and Trends in Machine Learning, pp.1-106, 2011.
DOI : 10.1561/2200000015

URL : https://hal.archives-ouvertes.fr/hal-00613125

A. Beck and M. Teboulle, A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, SIAM Journal on Imaging Sciences, vol.2, issue.1, 2009.
DOI : 10.1137/080716542

R. Bellman, A note on cluster analysis and dynamic programming, Mathematical Biosciences, vol.18, issue.3-4, 1973.
DOI : 10.1016/0025-5564(73)90007-2

G. Blanchard, M. Kawanabe, M. Sugiyama, V. Spokoiny, and K. Müller, In search of non-Gaussian components of a high-dimensional distribution, The Journal of Machine Learning Research, vol.7, pp.247-282, 2006.

N. Boumal, B. Mishra, P. Absil, and R. Sepulchre, Manopt, a Matlab Toolbox for Optimization on Manifolds, Journal of Machine Learning Research, 2014.

J. Bourgain, V. H. Vu, and P. M. Wood, On the singularity probability of discrete random matrices, Journal of Functional Analysis, vol.258, issue.2, pp.559-603, 2010.
DOI : 10.1016/j.jfa.2009.04.016

S. P. Boyd and L. Vandenberghe, Convex Optimization, 2004.

F. De la Torre and T. Kanade, Discriminative cluster analysis, Proceedings of the 23rd international conference on Machine learning, ICML '06, 2006.
DOI : 10.1145/1143844.1143875

E. Diederichs, A. Juditsky, A. Nemirovski, and V. Spokoiny, Sparse non Gaussian component analysis by semidefinite programming, Machine Learning, vol.91, issue.2, pp.211-238, 2013.
DOI : 10.1007/s10994-013-5331-1

URL : https://hal.archives-ouvertes.fr/hal-00978264

C. Ding and T. Li, Adaptive dimension reduction using discriminant analysis and K-means clustering, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007.
DOI : 10.1145/1273496.1273562

URL : https://hal.archives-ouvertes.fr/cea-01058940

D. Freedman, Statistical models: theory and practice, 2009.
DOI : 10.1017/CBO9780511815867

J. H. Friedman and W. Stuetzle, Projection Pursuit Regression, Journal of the American Statistical Association, vol.76, issue.376, pp.817-823, 1981.
DOI : 10.1080/01621459.1981.10477729

A. Frieze and M. Jerrum, Improved approximation algorithms for MAX k-CUT and MAX BISECTION, Integer Programming and Combinatorial Optimization, 1995.
DOI : 10.1007/3-540-59408-6_37

M. R. Garey, D. S. Johnson, and L. Stockmeyer, Some simplified NP-complete graph problems, Theoretical Computer Science, vol.1, issue.3, pp.237-267, 1976.
DOI : 10.1016/0304-3975(76)90059-1

M. X. Goemans and D. P. Williamson, Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming, Journal of the ACM, vol.42, issue.6, pp.1115-1145, 1995.
DOI : 10.1145/227683.227684

J. C. Gower and G. J. Ross, Minimum Spanning Trees and Single Linkage Cluster Analysis, Applied Statistics, vol.18, issue.1, 1969.
DOI : 10.2307/2346439

M. Grant and S. Boyd, Graph Implementations for Nonsmooth Convex Programs, Recent Advances in Learning and Control, Lecture Notes in Control and Information Sciences, pp.95-110, 2008.
DOI : 10.1007/978-1-84800-155-8_7

A. Hyvärinen, J. Karhunen, and E. Oja, Independent component analysis, 2004.

A. Joulin and F. Bach, A convex relaxation for weakly supervised classifiers, Proc. ICML, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00717450

A. Joulin, F. Bach, and J. Ponce, Discriminative clustering for image co-segmentation, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5539868

A. Joulin, J. Ponce, and F. Bach, Efficient optimization for discriminative latent class models, Adv. NIPS, 2010.

M. Journée, F. Bach, P. Absil, and R. Sepulchre, Low-Rank Optimization on the Cone of Positive Semidefinite Matrices, SIAM Journal on Optimization, vol.20, issue.5, 2010.
DOI : 10.1137/080731359

R. M. Karp, Reducibility among combinatorial problems, Complexity of Computer Computations, pp.85-103, 1972.

N. Le Roux and F. Bach, Local component analysis, Proceedings of the International Conference on Learning Representations, 2013.
URL : https://hal.archives-ouvertes.fr/inria-00617965

Z. Q. Luo, W. K. Ma, A. C. So, Y. Ye, and S. Zhang, Semidefinite Relaxation of Quadratic Optimization Problems, IEEE Signal Processing Magazine, vol.27, issue.3, 2010.
DOI : 10.1109/MSP.2010.936019

J. B. MacQueen, Some Methods for Classification and Analysis of Multivariate Observations, Proc. of the fifth Berkeley Symposium on Mathematical Statistics and Probability, pp.281-297, 1967.

Y. Nesterov, Smoothing technique and its applications in semidefinite optimization, Math. Program., 2007.

A. Y. Ng, M. I. Jordan, and Y. Weiss, On Spectral Clustering: Analysis and an algorithm, Adv. NIPS, 2002.

P. Schönemann, A generalized solution of the orthogonal procrustes problem, Psychometrika, vol.31, issue.1, 1966.
DOI : 10.1007/BF02289451