Extraction of correlated gene clusters from multiple genomic data by generalized kernel canonical correlation analysis.
Résumé
MOTIVATION: A major issue in computational biology is the reconstruction of pathways from several genomic datasets, such as expression data, protein interaction data and phylogenetic profiles. As a first step toward this goal, it is important to investigate the amount of correlation which exists between these data. RESULTS: These methods are successfully tested on their ability to recognize operons in the Escherichia coli genome, from the comparison of three datasets corresponding to functional relationships between genes in metabolic pathways, geometrical relationships along the chromosome, and co-expression relationships as observed by gene expression data.