Evolutionary Subspace Clustering Using Variable Genome Length

Sergio Peignier (1), Christophe Rigotti (2, 3, 4), Guillaume Beslon (2)
2 BEAGLE - Artificial Evolution and Computational Biology
LBBE - Laboratoire de Biométrie et Biologie Evolutive - UMR 5558, Inria Grenoble - Rhône-Alpes, LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
3 DM2L - Data Mining and Machine Learning
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract : Subspace clustering is a data mining task that groups similar data objects while simultaneously searching for the subspaces in which those similarities appear. For this reason, subspace clustering is recognized as more general and more complex than standard clustering. In this paper, we present ChameleoClust+, a bio-inspired evolutionary subspace clustering algorithm that takes advantage of an evolvable genome structure to detect various numbers of clusters located in different subspaces. ChameleoClust+ incorporates several bio-like features, such as a variable genome length, both functional and non-functional elements, and mutation operators that include large rearrangements. It was assessed and compared to state-of-the-art methods on a reference benchmark using both real-world and synthetic datasets. While other algorithms may need complex parameter settings, ChameleoClust+ requires setting only one intuitive parameter specific to subspace clustering: the maximal number of clusters. The remaining parameters of ChameleoClust+ relate to the evolution strategy (e.g., population size, mutation rate), and a single setting for all of them turned out to be effective for all the benchmark datasets. A sensitivity analysis was also carried out to study the impact of each parameter on the subspace clustering quality.
Document type :
Journal articles

https://hal.archives-ouvertes.fr/hal-02405598
Contributor : Christophe Rigotti
Submitted on : Wednesday, December 11, 2019 - 5:41:33 PM
Last modification on : Tuesday, December 17, 2019 - 2:27:32 AM

File

chameleoclust+_ComptIntell_pre...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02405598, version 1

Citation

Sergio Peignier, Christophe Rigotti, Guillaume Beslon. Evolutionary Subspace Clustering Using Variable Genome Length. Computational Intelligence, Wiley, In press, pp.1-39. ⟨hal-02405598⟩
