Is-ClusterMPP: clustering algorithm through point processes and influence space towards high-dimensional data - HAL open archive
Journal article. Advances in Data Analysis and Classification. Year: 2019

Abstract

Clustering via Marked Point Processes and Influence Space (Is-ClusterMPP) is a new unsupervised clustering algorithm based on adaptive MCMC sampling of a marked point process (MPP) of interacting balls. The chosen Gibbs energy cost function makes use of k-influence space information. The algorithm detects clusters of different shapes and sizes and with unbalanced local densities, and it also aims to handle high-dimensional and large-scale datasets. By using the k-influence space, Is-ClusterMPP addresses local heterogeneity in densities and limits the impact of the global density on the detection of unbalanced classes. This concept also reduces the number of input parameters. The curse of dimensionality is handled by a local subspace clustering principle embedded in a weighted similarity metric. The balls form a configuration sampled from the marked point process. Owing to the choice of energy, they tend to cover neighboring data points, which are then considered to belong to the same cluster. The energy balances two goals. (1) The data-driven term is defined according to the k-influence space: data in a high-density region are favored to be covered by a ball. (2) An interaction term prevents balls from fully overlapping and favors connected groups of balls. The algorithm, a Markov dynamics, converges towards configurations sampled from the MPP model. It has been applied to real benchmarks consisting of gene expression data of various sizes. Experiments compare Is-ClusterMPP against the most well-known clustering algorithms, and the authors claim its efficiency.
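
The abstract does not define the k-influence space, so the following is only a minimal sketch under an assumed definition: the k-influence space of a point is taken to be the set of its mutual neighbors, that is, the points that are among its k nearest neighbors and that also count it among their own k nearest neighbors. The function name `k_influence_space`, the Euclidean distance, and the toy data are illustrative choices and are not taken from the article.

```python
import numpy as np

def k_influence_space(X, k):
    """For each point i, return the set of indices j such that j is one of the
    k nearest neighbors of i AND i is one of the k nearest neighbors of j.
    ASSUMPTION: this mutual-neighbor definition and the Euclidean metric are
    illustrative; the abstract does not specify either."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    # Indices of the k nearest neighbors of each point.
    knn_idx = np.argsort(dist, axis=1, kind="stable")[:, :k]
    # is_knn[i, j] is True when j is among the k nearest neighbors of i.
    is_knn = np.zeros((n, n), dtype=bool)
    is_knn[np.arange(n)[:, None], knn_idx] = True
    # Keep only the pairs that are neighbors in both directions.
    mutual = is_knn & is_knn.T
    return [set(np.flatnonzero(row).tolist()) for row in mutual]

# Toy data (hypothetical): two tight groups and one isolated point.
X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10], [5, 30]]
spaces = k_influence_space(X, k=2)
sizes = [len(s) for s in spaces]
```

Under this assumed definition, points inside a dense group keep a full influence space regardless of how dense the other groups are, while the isolated point ends up with an empty one. This is one way to read the claim that the k-influence space captures local rather than global density, with k as the only input.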
No file deposited

Dates and versions

hal-02077905, version 1 (24-03-2019)

Identifiers

Cite

Khadidja Henni, Pierre-Yves Louis, Brigitte Vannier, Ahmed Moussa. Is-ClusterMPP: clustering algorithm through point processes and influence space towards high-dimensional data. Advances in Data Analysis and Classification, 2019, ⟨10.1007/s11634-019-00379-2⟩. ⟨hal-02077905⟩

Collections

CNRS UNIV-POITIERS