S. Amari, Information Geometry and Its Applications, 2016.

C. Bouveyron, S. Girard, and C. Schmid, High-dimensional data clustering, Computational Statistics and Data Analysis, vol.52, pp.502-519, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00548591

P. Bühlmann, P. Drineas, M. Kane, and M. van der Laan, Handbook of Big Data, 2016.

O. Cappé and E. Moulines, On-line expectation-maximization algorithm for latent data models, Journal of the Royal Statistical Society B, vol.71, pp.593-613, 2009.

G. Celeux, S. Chretien, F. Forbes, and A. Mkhadri, A component-wise EM algorithm for mixtures, Journal of Computational and Graphical Statistics, vol.10, pp.697-712, 2001.
URL : https://hal.archives-ouvertes.fr/inria-00072916

M. Chau and M. C. Fu, An overview of stochastic approximation, Handbook of Simulation Optimization, pp.149-178, 2015.

H. Chen, Stochastic Approximation and Its Applications, 2003.

A. Cotter, O. Shamir, N. Srebro, and K. Sridharan, Better mini-batch algorithms via accelerated gradient methods, Advances in Neural Information Processing Systems, pp.1647-1655, 2011.

A. DasGupta, Probability for Statistics and Machine Learning, 2011.

B. Delyon, M. Lavielle, and E. Moulines, Convergence of a stochastic approximation version of the EM algorithm, Annals of Statistics, vol.27, pp.94-128, 1999.

A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society Series B, vol.39, pp.1-38, 1977.

D. Eddelbuettel, Seamless R and C++ Integration with Rcpp, 2013.

R. A. Fisher, The use of multiple measurements in taxonomic problems, Annals of Eugenics, pp.179-188, 1936.

C. Forbes, M. Evans, N. Hastings, and B. Peacock, Statistical Distributions, 2011.

C. Fraley, A. Raftery, and R. Wehrens, Incremental model-based clustering for large datasets with small clusters, Journal of Computational and Graphical Statistics, vol.14, pp.529-546, 2005.

S. Ghadimi, G. Lan, and H. Zhang, Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization, Mathematical Programming Series A, vol.155, pp.267-305, 2016.

Z. Han, M. Hong, and D. Wang, Signal Processing and Networking for Big Data Applications, 2017.

W. K. Härdle, H. H.-S. Lu, and X. Shen, Handbook of Big Data Analytics, 2018.

J. A. Hartigan and M. A. Wong, Algorithm AS 136: A k-means clustering algorithm, Journal of the Royal Statistical Society Series C, vol.28, pp.100-108, 1979.

L. Hubert and P. Arabie, Comparing partitions, Journal of Classification, vol.2, pp.193-218, 1985.

K. E. Iverson, A Programming Language, 1967.

I. T. Jolliffe, Principal Component Analysis, 2002.

P. N. Jones and G. J. McLachlan, Fitting finite mixture models in a regression context, Australian Journal of Statistics, vol.34, pp.233-240, 1992.

J. Kiefer and J. Wolfowitz, Stochastic estimation of the maximum of a regression function, Annals of Mathematical Statistics, vol.23, pp.462-466, 1952.

S. Kullback and R. A. Leibler, On information and sufficiency, Annals of Mathematical Statistics, vol.22, pp.79-86, 1951.

H. J. Kushner and G. G. Yin, Stochastic Approximation and Recursive Algorithms and Applications, 2003.

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, vol.86, pp.2278-2324, 1998.

M. Li, T. Zhang, Y. Chen, and A. J. Smola, Efficient mini-batch training for stochastic optimization, Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.661-670, 2014.

F. Liang and J. Zhang, Estimating the false discovery rate using the stochastic approximation algorithm, Biometrika, vol.95, pp.961-977, 2008.

G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, 2008.

G. J. McLachlan, S. X. Lee, and S. I. Rathnayake, Finite mixture models, Annual Review of Statistics and Its Application, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02415068

G. J. McLachlan and D. Peel, Finite Mixture Models, 2000.

V. Melnykov, W. Chen, and R. Maitra, MixSim: an R package for simulating data to study performance of clustering algorithms, Journal of Statistical Software, vol.51, pp.1-25, 2012.

S. Ng and G. J. McLachlan, Speeding up the EM algorithm for mixture model-based segmentation of magnetic resonance images, Pattern Recognition, vol.37, pp.1573-1589, 2004.

H. D. Nguyen and F. Chamroukhi, Practical and theoretical aspects of mixture-of-experts modeling: an overview, WIREs Data Mining and Knowledge Discovery, p.1246, 2018.

H. D. Nguyen and A. T. Jones, Big Data-appropriate clustering via stochastic approximation and Gaussian mixture models, Data Analytics: Concepts, Techniques, and Applications, 2018.

H. D. Nguyen and G. J. McLachlan, Maximum likelihood estimation of Gaussian mixture models without matrix operations, Advances in Data Analysis and Classification, vol.9, pp.371-394, 2015.

K. Pearson, Contributions to the mathematical theory of evolution, Philosophical Transactions of the Royal Society of London A, vol.185, pp.71-110, 1894.

B. T. Polyak, A new method of stochastic approximation type, Automation and Remote Control, vol.51, pp.98-107, 1990.

B. T. Polyak and A. B. Juditsky, Acceleration of stochastic approximation by averaging, SIAM Journal on Control and Optimization, vol.30, pp.838-855, 1992.

A. Prosperetti, Advanced Mathematics for Applications, 2011.

R Core Team, R: a language and environment for statistical computing, R Foundation for Statistical Computing, 2018.

H. Robbins and S. Monro, A stochastic approximation method, Annals of Mathematical Statistics, vol.22, pp.400-407, 1951.

E. Schubert, A. Koos, T. Emrich, A. Zufle, K. A. Schmid et al., A framework for clustering uncertain data, Proceedings of the VLDB Endowment, vol.8, pp.1976-1979, 2015.

L. Scrucca, M. Fop, T. B. Murphy, and A. E. Raftery, mclust: clustering, classification and density estimation using Gaussian finite mixture models, R Journal, vol.8, pp.289-317, 2016.

N. Vlassis and A. Likas, A greedy EM algorithm for Gaussian mixture learning, Neural Processing Letters, vol.15, pp.77-87, 2002.

H. White, Maximum likelihood estimation of misspecified models, Econometrica, vol.50, pp.1-25, 1982.

H. White, Asymptotic Theory for Econometricians, 2001.

H. Wickham, D. Cook, H. Hofmann, and A. Buja, tourr: an R package for exploring multivariate data with projections, Journal of Statistical Software, vol.40, pp.1-18, 2011.

C. F. Wu, On the convergence properties of the EM algorithm, Annals of Statistics, vol.11, pp.95-103, 1983.

L. Xu, M. I. Jordan, and G. E. Hinton, An alternative model for mixtures of experts, Advances in Neural Information Processing Systems, pp.633-640, 1995.

J. Zhang and F. Liang, Convergence of stochastic approximation algorithms under irregular conditions, Statistica Neerlandica, vol.62, pp.393-403, 2008.

T. Zhao, M. Yu, Y. Wang, R. Arora, and H. Liu, Accelerated mini-batch randomized block coordinate descent method, Advances in Neural Information Processing Systems, pp.3329-3337, 2014.