Model Selection for Gaussian Latent Block Clustering with the Integrated Classification Likelihood
Abstract
For a given data table, several candidate models are usually examined, differing, for example, in the number of clusters. Model selection then becomes a critical issue. To address it, we develop a criterion based on an approximation of the Integrated Classification Likelihood for the Gaussian latent block model, and propose a BIC-like variant following the same pattern. We also propose an exact, non-asymptotic criterion, which circumvents the controversial definition of the asymptotic regime that arises from the dual nature of rows and columns in co-clustering. Experimental results show that these criteria perform consistently on medium to large data tables.