greed: An R Package for Model-Based Clustering by Greedy Maximization of the Integrated Classification Likelihood - HAL open archive
Preprint, Working Paper. Year: 2022

Abstract

The greed package implements the general and flexible framework of arXiv:2002.11577 for model-based clustering in the R language. Because it directly maximizes the exact Integrated Classification Likelihood with respect to the partition, it performs clustering and selects the number of groups jointly. This combinatorial problem is handled by an efficient hybrid genetic algorithm, and a final hierarchical step gives access to coarser partitions and extracts an ordering of the clusters. The methodology applies to a wide variety of latent variable models and can therefore handle various data types as well as heterogeneous data. Classical models for continuous, count, categorical and graph data are implemented, and new models can be incorporated thanks to S4 class abstraction. This paper introduces the package, describes the design choices that guided its development, and illustrates its usage on practical use cases.
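The objective summarized in the abstract can be sketched as follows. This is a hedged illustration, not a statement taken from this page: the symbols X (observed data), Z (partition), θ (model parameters) and β (prior hyperparameters) are notation assumed here, chosen to match common usage for the exact Integrated Classification Likelihood, and `\operatorname*` assumes the amsmath package.

```latex
% X: observed data; Z: partition of the observations into groups;
% \theta: model parameters; \beta: prior hyperparameters.
% All symbols are assumed notation, introduced for illustration only.
\[
  \mathrm{ICL}_{\mathrm{ex}}(Z;\beta)
  = \log p(X, Z \mid \beta)
  = \log \int p(X, Z \mid \theta)\, p(\theta \mid \beta)\, \mathrm{d}\theta,
  \qquad
  \hat{Z} = \operatorname*{arg\,max}_{Z}\; \mathrm{ICL}_{\mathrm{ex}}(Z;\beta).
\]
```

Under this reading, the number of non-empty groups is a function of Z, so maximizing over the partition also selects the number of groups, which is why the two tasks are solved jointly.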

Dates and versions

hal-03656336, version 1 (02-05-2022)

Cite

Etienne Côme, Nicolas Jouvin. greed: An R Package for Model-Based Clustering by Greedy Maximization of the Integrated Classification Likelihood. 2022. ⟨hal-03656336⟩