. Wiley-blackwell, Chapter 18, pp.93-97

A. Alamsyah and B. Nurriz, Monte Carlo simulation and clustering for customer segmentation in business organization, 3rd International Conference on Science and Technology-Computer, pp.104-109, 2017.

E. Charles and . Antoniak, Mixtures of Dirichlet Processes with Applications to Bayesian Nonparametric Problems, Ann. Statist, vol.2, issue.6, pp.1152-1174, 1974.

J. Dean and S. Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, Commun. ACM, vol.51, pp.107-113, 2008.

T. Debatty, P. Michiardi, W. Mees, and O. Thonnard, Determining the k in k-means with MapReduce, EDBT/ICDT Workshops, pp.19-28, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01525708

P. Arthur, N. M. Dempster, D. Laird, and . Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the royal statistical society. Series B (methodological, pp.1-38, 1977.

A. Ene, S. Im, and B. Moseley, Fast clustering using MapReduce, Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pp.681-689, 2011.

. Michael-d-escobar, Estimating normal means with a Dirichlet process prior, J. Amer. Statist. Assoc, vol.89, pp.268-277, 1994.

D. Michael, M. Escobar, and . West, Bayesian density estimation and inference using mixtures, Journal of the american statistical association, vol.90, pp.577-588, 1995.

Y. Gal and Z. Ghahramani, Pitfalls in the use of parallel inference for the Dirichlet process, Proceedings of the 31st International Conference on Machine Learning, pp.208-216, 2014.

A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin, Bayesian Data Analysis, 2004.

J. Gonzalez, Y. Low, A. Gretton, and C. Guestrin, Parallel gibbs sampling: From colored fields to thin junction trees, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pp.324-332, 2011.

V. Hodge and J. Austin, A Survey of Outlier Detection Methodologies, Artificial Intelligence Review, vol.22, pp.85-126, 2004.

G. James, D. Witten, T. Hastie, and R. Tibshirani, An introduction to statistical learning, vol.112, 2013.

A. Jasra, C. C. Holmes, and D. Stephens, Markov chain Monte Carlo methods and the label switching problem in Bayesian mixture modeling, Statist. Sci, pp.50-67, 2005.

D. Lovell, P. Ryan, V. K. Adams, and . Mansingka, Parallel markov chain monte carlo for dirichlet process mixtures, Workshop on Big Learning, NIPS, 2012.

C. Ma, X. Hao-helen-zhang, and . Wang, Machine learning for Big Data analytics in plants, Trends in Plant Science, vol.19, pp.798-808, 0112.

W. Jeffrey, M. Miller, and . Harrison, Inconsistency of Pitman-Yor process mixtures for the number of components, The Journal of Machine Learning Research, vol.15, pp.3333-3370, 2014.

M. Radford and . Neal, Markov chain sampling methods for Dirichlet process mixture models, Journal of computational and graphical statistics, vol.9, pp.249-265, 2000.

D. Newman, A. Asuncion, P. Smyth, and M. Welling, Distributed algorithms for topic models, Journal of Machine Learning Research, vol.10, pp.1801-1828, 2009.

I. Ordovás-pascual and J. Sánchez-almeida, A fast version of the k-means classification algorithm for astronomical applications, Astronomy & Astrophysics, vol.565, 2014.

J. Sethuraman, A constructive definition of Dirichlet priors, Statistica sinica, pp.639-650, 1994.

J. Shafer, S. Rixner, and A. L. Cox, The Hadoop distributed filesystem: Balancing portability and performance, 2010.

J. Nguyen-xuan-vinh, J. Epps, and . Bailey, Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance, J. Mach. Learn. Res, vol.11, pp.2837-2854, 2010.

R. Wang and D. Lin, Scalable Estimation of Dirichlet Process Mixture Models on Distributed Data, Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI'17), pp.4632-4639, 2017.

S. Williamson, A. Dubey, and E. Xing, Parallel Markov chain Monte Carlo for nonparametric mixture models, International Conference on Machine Learning, pp.98-106, 2013.

M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, Spark: Cluster Computing with Working Sets. In HotCloud, 2010.