Constrained co-clustering of gene expression data.

Ruggero Pensa 1 Jean-François Boulicaut 1
1 DM2L - Data Mining and Machine Learning
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract: In many applications, experts find co-clustering results easier to interpret than mono-dimensional clustering results. Co-clustering aims at computing a bi-partition, that is, a collection of co-clusters: each co-cluster is a group of objects associated with a group of attributes, and these associations can support interpretations. Many constrained clustering algorithms have been proposed to exploit domain knowledge and to improve partition relevancy in the mono-dimensional case (e.g., using the so-called must-link and cannot-link constraints). Here, we consider constrained co-clustering not only for extended must-link and cannot-link constraints (i.e., both objects and attributes can be involved), but also for interval constraints that enforce properties of co-clusters when considering ordered domains. We propose an iterative co-clustering algorithm which exploits user-defined constraints while minimizing the sum-squared residue, i.e., an objective function introduced for gene expression data clustering by Cho et al. (2004). We illustrate the added value of our approach in two applications that concern gene expression data analysis.
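To make the objective and the constraints concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the residue form h_ij = a_ij - a_iJ - a_Ij + a_IJ (entry minus its co-cluster row mean and column mean, plus the co-cluster mean), which is one of the residues discussed by Cho et al. (2004); the abstract does not say which form is used. The function names `sum_squared_residue` and `violated_constraints` and the toy matrix are hypothetical, and interval constraints are not covered.

```python
from collections import defaultdict


def sum_squared_residue(data, row_labels, col_labels):
    """Sum of squared residues over all co-clusters of a bi-partition.

    Assumed residue: h_ij = a_ij - a_iJ - a_Ij + a_IJ, where the means are
    taken inside the co-cluster (I, J) that contains entry (i, j).
    """
    # Group row indices by row cluster and column indices by column cluster.
    rows_of, cols_of = defaultdict(list), defaultdict(list)
    for i, label in enumerate(row_labels):
        rows_of[label].append(i)
    for j, label in enumerate(col_labels):
        cols_of[label].append(j)

    total = 0.0
    for rows in rows_of.values():
        for cols in cols_of.values():
            block_mean = sum(data[i][j] for i in rows for j in cols) / (len(rows) * len(cols))
            row_mean = {i: sum(data[i][j] for j in cols) / len(cols) for i in rows}
            col_mean = {j: sum(data[i][j] for i in rows) / len(rows) for j in cols}
            for i in rows:
                for j in cols:
                    h = data[i][j] - row_mean[i] - col_mean[j] + block_mean
                    total += h * h
    return total


def violated_constraints(labels, must_link, cannot_link):
    """List the must-link / cannot-link pairs that a labelling breaks.

    The same check applies to objects (row labels) or attributes (column labels).
    """
    violations = []
    for a, b in must_link:
        if labels[a] != labels[b]:
            violations.append(("must-link", a, b))
    for a, b in cannot_link:
        if labels[a] == labels[b]:
            violations.append(("cannot-link", a, b))
    return violations


# Hypothetical toy matrix: 4 objects (rows) x 4 attributes (columns).
data = [
    [1, 2, 10, 11],
    [2, 3, 11, 12],
    [8, 9, 1, 2],
    [9, 10, 2, 3],
]
row_labels = [0, 0, 1, 1]
col_labels = [0, 0, 1, 1]

residue_two_by_two = sum_squared_residue(data, row_labels, col_labels)
residue_single = sum_squared_residue(data, [0, 0, 0, 0], [0, 0, 0, 0])
row_violations = violated_constraints(row_labels, must_link=[(0, 2)], cannot_link=[(1, 2)])
```

A constrained variant would, under these assumptions, only accept reassignments of rows or columns that lower the residue while keeping the list of violations empty.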
Document type:
Conference paper
SIAM International Conference on Data Mining SDM'08, Apr 2008, Atlanta, United States. pp.25-36, 2008
Contributor: Équipe Gestionnaire Des Publications Si Liris
Submitted on: Monday, April 3, 2017 - 14:55:44
Last modified on: Friday, January 11, 2019 - 16:29:24


  • HAL Id : hal-01500611, version 1


Ruggero Pensa, Jean-François Boulicaut. Constrained co-clustering of gene expression data. SIAM International Conference on Data Mining SDM'08, Apr 2008, Atlanta, United States. pp.25-36, 2008. 〈hal-01500611〉


