F. Cesarini, M. Lastri, S. Marinai, and G. Soda, Encoding of modified X-Y trees for document classification, ICDAR, p.1131, 2001.

N. Chen and D. Blostein, A survey of document image classification: problem statement, classifier architecture and performance evaluation, International Journal of Document Analysis and Recognition (IJDAR), vol.10, issue.1, pp.1-16, 2007.
DOI : 10.1007/s10032-006-0020-2

B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, Least angle regression, The Annals of Statistics, vol.32, issue.2, pp.407-451, 2004.

C. Fraley and A. E. Raftery, Enhanced Model-Based Clustering, Density Estimation, and Discriminant Analysis Software: MCLUST, Journal of Classification, vol.20, issue.2, pp.263-286, 2003.
DOI : 10.1007/s00357-003-0015-3

H. Frigui and R. Krishnapuram, A robust competitive clustering algorithm with applications in computer vision, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.21, issue.5, pp.450-465, 1999.

N. Grira, M. Crucianu, and N. Boujemaa, Active semi-supervised fuzzy clustering, Pattern Recognition, vol.41, issue.5, pp.1834-1844, 2008.
DOI : 10.1016/j.patcog.2007.10.004

M. Halkidi, M. Vazirgiannis, and Y. Batistakis, Quality Scheme Assessment in the Clustering Process, Principles of Data Mining and Knowledge Discovery, pp.265-276, 2000.
DOI : 10.1007/3-540-45372-5_26

L. Kaufman and P. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis, 1990.
DOI : 10.1002/9780470316801

A. Oliveira-Brochado and F. V. Martins, Assessing the number of components in mixture models: a review, 2005.

K. Pollard and M. J. van der Laan, A method to identify significant clusters in gene expression data, Invited Proceedings of Sci2002, pp.318-325, 2002.

P. J. Rousseeuw, Silhouettes: A graphical aid to the interpretation and validation of cluster analysis, Journal of Computational and Applied Mathematics, vol.20, pp.53-65, 1987.
DOI : 10.1016/0377-0427(87)90125-7

E. Saund, Scientific challenges underlying production document processing, Document Recognition and Retrieval XVIII, 2011.
DOI : 10.1117/12.876948

C. Shin, D. Doermann, and A. Rosenfeld, Classification of document pages using structure-based features, International Journal on Document Analysis and Recognition, vol.3, issue.4, pp.232-247, 2001.
DOI : 10.1007/PL00013566

Q. Zhao, V. Hautamäki, and P. Fränti, Knee Point Detection in BIC for Detecting the Number of Clusters, Advanced Concepts for Intelligent Vision Systems, pp.664-673, 2008.
DOI : 10.1007/s100440070007

X. S. Zhou and T. S. Huang, Relevance feedback in image retrieval: A comprehensive review, Multimedia Systems, pp.536-544, 2003.