Non-Disjoint Clustered Representation for Distributions over a Population of Cells (poster)

Matthieu Pichené; Sucheendra K. Palaniappan; Eric Fabre; Blaise Genest

Conference paper, Year: 2017


Matthieu Pichené

Role: Author

SUpervision of large MOdular and distributed systems

Sucheendra K. Palaniappan

Role: Author

SUpervision of large MOdular and distributed systems

Eric Fabre

Role: Author
PersonId: 170693
IdHAL: eric-fabre
ORCID: 0000-0001-9008-8234
IdRef: 103579648

SUpervision of large MOdular and distributed systems

Blaise Genest

Role: Author
PersonId: 864401

SUpervision of large MOdular and distributed systems

Abstract

We consider a large homogeneous population of cells, where each cell is governed by the same complex biological pathway. Accurately modeling the inherent variability of biological species is crucial to understanding how the population evolves. In this work, we handle this variability by considering multivariate distributions, where each species is a random variable. Usually, the number of species in a pathway, and thus the number of variables, is high. This appealing approach therefore quickly faces the curse of dimensionality: exactly representing the distribution of a large number of variables is intractable.

To make the approach tractable, we explore different techniques for approximating the original joint distribution by meaningful and tractable ones. The idea is to consider families of joint probability distributions over large sets of random variables that admit a compact representation, and then to select within a family the distribution that best approximates the desired intractable one. Natural measures of approximation accuracy can be derived from information theory. We compare several representations on distributions over populations of cells obtained from several fine-grained models of pathways (e.g. ODEs). We also explore the value of such approximate distributions for approximate inference algorithms [1, 2] on coarse-grained abstractions of biological pathways [3].

Results

Our approximation scheme is to drop most correlations between variables. Indeed, when many variables are conditionally independent, the multivariate distribution can be represented compactly. The key is to keep the most relevant correlations, evaluated using the mutual information (MI) between two variables.

The simplest approximation, called fully factored (FF), assumes that all variables are independent. It leads to a very compact representation and fast computations, but also to fairly inaccurate results, since correlations between variables are entirely lost, even for highly correlated species (MI = 0.6).

Alternatively, one can preserve a few of the strongest correlations, selected using MI, giving rise to a set of disjoint clusters of variables. For efficiency reasons, we used clusters of size two. This model captured some of the most significant correlations between pairs of variables (representing around 30% of the total MI), but dropped other significant ones (MI = 0.2).
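As an illustration of MI-based selection of disjoint clusters of size two, the following is a minimal sketch, not the authors' implementation. It assumes that each variable has been discretized into integer levels and that the population is given as a sample matrix (one row per cell, one column per species). The function names, the greedy selection order, and the `min_mi` threshold are all hypothetical choices made for this example.

```python
import itertools
import numpy as np


def pairwise_joint(samples, i, j, n_levels):
    """Empirical joint probability table of columns i and j (integer-coded)."""
    table = np.zeros((n_levels, n_levels))
    np.add.at(table, (samples[:, i], samples[:, j]), 1.0)  # count co-occurrences
    return table / table.sum()


def mutual_information(joint):
    """Mutual information, in bits, of a 2-D joint probability table."""
    px = joint.sum(axis=1, keepdims=True)   # marginal of the row variable
    py = joint.sum(axis=0, keepdims=True)   # marginal of the column variable
    mask = joint > 0                        # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (px @ py)[mask])))


def greedy_disjoint_pairs(samples, n_levels, min_mi=0.0):
    """Greedily keep the highest-MI pairs so that no variable is used twice.

    Assumption: a greedy pass over pairs sorted by MI; pairs whose MI does
    not exceed min_mi are left fully factored.
    """
    n_vars = samples.shape[1]
    scored = sorted(
        ((mutual_information(pairwise_joint(samples, i, j, n_levels)), i, j)
         for i, j in itertools.combinations(range(n_vars), 2)),
        reverse=True,
    )
    used, pairs = set(), []
    for mi, i, j in scored:
        if mi > min_mi and i not in used and j not in used:
            pairs.append((i, j, mi))
            used.update((i, j))
    return pairs


# Toy population: variable 1 copies variable 0; variables 2 and 3 are independent.
rows = [(x0, x0, x2, x3) for x0, x2, x3 in itertools.product((0, 1), repeat=3)]
samples = np.array(rows)
pairs = greedy_disjoint_pairs(samples, n_levels=2)
print(pairs)
```

Variables that end up in no pair are treated as independent, so with an empty selection the sketch reduces to the fully factored case. Because the clusters are disjoint, any correlation involving an already-used variable is dropped regardless of its MI.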

Domains

Other [cs.OH]

Main file

PPFG17.pdf (134.24 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-01625665

Submitted on: Saturday, October 28, 2017, 09:59:35

Last modified on: Friday, March 24, 2023, 14:53:05

Long-term archiving on: Monday, January 29, 2018, 14:40:39

Dates and versions

hal-01625665, version 1 (28-10-2017)

Identifiers

HAL Id: hal-01625665, version 1

Cite

Matthieu Pichené, Sucheendra K. Palaniappan, Eric Fabre, Blaise Genest. Non-Disjoint Clustered Representation for Distributions over a Population of Cells (poster). CMSB 2017, 2017, Darmstadt, Germany. pp.324-326. ⟨hal-01625665⟩


Collections

UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

