Noise-free Latent Block Model for High Dimensional Data - Laboratoire Jean Kuntzmann
Journal article: Data Mining and Knowledge Discovery, 2018

Noise-free Latent Block Model for High Dimensional Data

Abstract

Co-clustering is known to be a very powerful and efficient approach in unsupervised learning because of its ability to partition data based on both the observations and the variables of a given dataset. However, in a high-dimensional context, co-clustering methods may fail to provide a meaningful result due to the presence of noisy and/or irrelevant features. In this paper, we tackle this issue by proposing a novel co-clustering model which assumes the existence of a noise cluster that contains all irrelevant features. A variational expectation-maximization (VEM)-based algorithm is derived for this task, where the automatic variable selection as well as the joint clustering of objects and variables are achieved via a Bayesian framework. Experimental results on synthetic datasets show the efficiency of our model in the context of high-dimensional noisy data. Finally, we highlight the interest of the approach on two real datasets whose goal is to study genetic diversity across the world.
Main file: Laclau2017Noise.pdf (412.79 KB)
Origin : Files produced by the author(s)

Dates and versions

hal-01685777 , version 1 (16-01-2018)
hal-01685777 , version 2 (29-10-2018)

Cite

Charlotte Laclau, Vincent Brault. Noise-free Latent Block Model for High Dimensional Data. Data Mining and Knowledge Discovery, inPress, ⟨10.1007/s10618-018-0597-3⟩. ⟨hal-01685777v1⟩