%0 Journal Article %T Textual data summarization using the Self-Organized Co-Clustering model %+ Entrepôts, Représentation et Ingénierie des Connaissances (ERIC) %+ MOdel for Data Analysis and Learning (MODAL) %A Selosse, Margot %A Jacques, Julien %A Biernacki, Christophe %< avec comité de lecture %@ 0031-3203 %J Pattern Recognition %I Elsevier %V 103 %P 107315 %8 2020-02 %D 2020 %R 10.1016/j.patcog.2020.107315 %K coclustering %K Latent Block Model %K document-term matrix %Z Mathematics [math]/Statistics [math.ST]Journal articles %X Recently, different studies have demonstrated the use of co-clustering, a data mining technique which simultaneously produces row-clusters of observations and column-clusters of features. The present work introduces a novel co-clustering model to easily summarize textual data in a document-term format. In addition to highlighting homogeneous co-clusters as other existing algorithms do, we also distinguish noisy co-clusters from significant co-clusters, which is particularly useful for sparse document-term matrices. Furthermore, our model proposes a structure among the significant co-clusters, thus providing improved interpretability to users. The approach proposed contends with state-of-the-art methods for document and term clustering and offers user-friendly results. The model relies on the Poisson distribution and on a constrained version of the Latent Block Model, which is a probabilistic approach for co-clustering. A Stochastic Expectation-Maximization algorithm is proposed to run the model’s inference as well as a model selection criterion to choose the number of co-clusters. Both simulated and real data sets illustrate the efficiency of this model by its ability to easily identify relevant co-clusters.
%G English %2 https://hal.science/hal-02115294v3/document %2 https://hal.science/hal-02115294v3/file/manuscript.pdf %L hal-02115294 %U https://hal.science/hal-02115294 %~ CNRS %~ INRIA %~ UNIV-LYON1 %~ UNIV-LYON2 %~ INRIA-LILLE %~ INSMI %~ ERIC %~ INRIA_TEST %~ LORIA2 %~ TESTALAIN1 %~ INRIA2 %~ UNIV-LILLE %~ LYON2 %~ UDL %~ UNIV-LYON %~ INRIAARTDOI %~ LPP-MATH %~ AILYS