k-means++: the advantages of careful seeding, Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp.1027-1035, 2007.
On the optimality of conditional expectation as a Bregman predictor, IEEE Transactions on Information Theory, vol.51, 2005.
Clustering with Bregman divergences, Journal of Machine Learning Research, vol.6, pp.1705-1749, 2005.
On the performance of clustering in Hilbert spaces, IEEE Trans. Inform. Theory, vol.54, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00290855
Theory of classification: a survey of some recent advances, ESAIM Probab. Stat., vol.9, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00017923
Concentration inequalities. A nonasymptotic theory of independence, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00777381
The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming, USSR Computational Mathematics and Mathematical Physics, vol.7, pp.200-217, 1967.
Empirical risk minimization for heavy-tailed losses, Ann. Statist., vol.43, 2015.
Efficient and fast estimation of the geometric median in Hilbert spaces with an averaged stochastic gradient algorithm, Bernoulli, vol.19, pp.18-43, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00558481
Dimension-free PAC-Bayesian bounds for the estimation of the mean of a random vector, 2018.
Prediction, Learning, and Games, 2006.
Geometric Inference for Measures based on Distance Functions, Foundations of Computational Mathematics, vol.11, pp.733-751, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00383685
Geometric Inference for Probability Measures, Foundations of Computational Mathematics, vol.11, pp.733-751, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00772444
Persistence-based clustering in Riemannian manifolds, J. ACM, vol.60, 2013.
URL : https://hal.archives-ouvertes.fr/inria-00389390
Fitting Models to Daily Rainfall Data, Journal of Applied Meteorology, vol.21, pp.1024-1031, 1982.
Trimmed k-means: an attempt to robustify quantizers, Ann. Statist., vol.25, pp.553-576, 1997.
The notion of breakdown point, pp.157-184, 1983.
Pattern Classification, 2000.
A general trimming approach to robust cluster analysis, Ann. Statist., vol.36, issue.3, 2008.
A simple LNRE model for random character sequences, Proceedings of the 7èmes Journées Internationales d'Analyse Statistique des Données Textuelles, pp.411-422, 2004.
Quantization and clustering with Bregman divergences, J. Multivariate Anal., vol.101, 2010.
tclust: An R Package for a Trimming Approach to Cluster Analysis, Journal of Statistical Software, vol.47, pp.1-26, 2012.
Best approximations to random variables based on trimming procedures, J. Approx. Theory, vol.64, pp.162-180, 1991.
Distortion measures for speech processing, IEEE Transactions on Acoustics, Speech and Signal Processing, vol.28, pp.367-376, 1980.
dbscan: Fast Density-Based Clustering with R, Journal of Statistical Software, vol.91, pp.1-30, 2019.
Clustering with spectral norm and the k-means algorithm, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science-FOCS 2010, pp.299-308, 2010.
Robust machine learning by median-of-means: theory and practice, 2017.
Nonasymptotic bounds for vector quantization in Hilbert spaces, Ann. Statist., vol.43, issue.2, pp.592-619, 2015.
Quantization/Clustering: when and why does k-means work?, JSFdS, vol.159, issue.1, pp.1-26, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01667014
Learning-theoretic methods in vector quantization, Principles of nonparametric learning, vol.434, pp.163-210, 2001.
Least squares quantization in PCM, IEEE Transactions on Information Theory, vol.28, pp.129-137, 1982.
Robust statistics: theory and methods, Wiley Series in Probability and Statistics, 2006.
Entropy and the combinatorial dimension, Invent. Math., vol.152, pp.37-55, 2003.
Bregman Voronoi diagrams: properties, algorithms and applications, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00137865
Convex Analysis, 1970.
Cluster ensembles - A knowledge reuse framework for combining multiple partitions, Journal of Machine Learning Research, vol.3, pp.583-617, 2002.
On Lloyd's Algorithm: New Theoretical Insights for Clustering in Practice, Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, vol.51, pp.1280-1289, 2016.
Humanities Data in R: Exploring Networks, Geospatial Data, Images, and Text, 1st ed., Quantitative Methods in the Humanities and Social Sciences, 2015.
, 25) and c3 = (40, 40), I2 the identity matrix on R^2 and σ = (σ1, σ2, σ3). The first distribution L1 corresponds to clusters with the same variance, with σ = (5, 5, 5), the second distribution L2 to clusters with increasing variance, with σ = (1, 4, 7), and the third distribution L3 to clusters with increasing and decreasing variance