Book chapter. Year: 2020

PerioClust: a simple hierarchical agglomerative clustering approach including constraints

Abstract

PerioClust is a hierarchical agglomerative clustering (HAC) method that includes temporal (resp. spatial) ordering constraints. This new semi-supervised learning algorithm is designed to consider two potentially error-prone sources of information associated with the same observations: one reflects dissimilarities in the "feature space", and the other reflects the temporal (resp. spatial) constraint structure between the observations. A distance-based approach is adopted: the distance measure in the classical HAC algorithm is replaced by a convex combination of the two initial dissimilarity matrices. The choice of the mixing parameter is therefore the key point. We define a criterion based on cophenetic distances, as well as a resampling procedure to ensure that the proposed clustering method is robust. The dendrogram associated with this HAC can be interpreted as a compromise between the two sources of information analysed separately. We illustrate our clustering method on two real data sets: (i) an archaeological one containing temporal information, and (ii) a socio-economic one containing geographical information.
No file deposited
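The convex-combination idea in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the rescaling of each dissimilarity to [0, 1], the average-linkage choice, the toy data, and the specific criterion (the absolute gap between the two cophenetic correlations) are all hypothetical choices made here, since the abstract only says the criterion is "based on cophenetic distances". The resampling procedure is not shown.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import pdist


def mixed_dissimilarity(d_feat, d_constr, alpha):
    """Convex combination of two condensed dissimilarity vectors.

    Each source is rescaled to [0, 1] first (an assumption made here so
    that neither source dominates only because of its units).
    """
    d_feat = d_feat / d_feat.max()
    d_constr = d_constr / d_constr.max()
    return alpha * d_feat + (1.0 - alpha) * d_constr


def cophenetic_gap(d_feat, d_constr, alpha, method="average"):
    """Assumed criterion: how unevenly the mixed dendrogram fits each source.

    Returns |cor(coph, d_feat) - cor(coph, d_constr)|, where coph holds the
    cophenetic distances of the HAC built on the mixed dissimilarity.
    """
    tree = linkage(mixed_dissimilarity(d_feat, d_constr, alpha), method=method)
    coph = cophenet(tree)  # condensed cophenetic distances
    r_feat = np.corrcoef(coph, d_feat)[0, 1]
    r_constr = np.corrcoef(coph, d_constr)[0, 1]
    return abs(r_feat - r_constr)


def choose_alpha(d_feat, d_constr, grid=None):
    """Grid search for the mixing parameter that minimises the assumed gap."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 21)
    gaps = [cophenetic_gap(d_feat, d_constr, a) for a in grid]
    return float(grid[int(np.argmin(gaps))])


# Toy data (hypothetical): 12 observations with 3 features and a time index.
rng = np.random.default_rng(0)
n_obs = 12
time_index = np.arange(n_obs, dtype=float)
features = rng.normal(size=(n_obs, 3)) + 0.2 * time_index[:, None]

d_feat = pdist(features)              # dissimilarities in the feature space
d_time = pdist(time_index[:, None])   # temporal constraint structure

alpha = choose_alpha(d_feat, d_time)
tree = linkage(mixed_dissimilarity(d_feat, d_time, alpha), method="average")
```

With `alpha = 1` the clustering uses only the feature dissimilarities, and with `alpha = 0` only the temporal structure; intermediate values give the compromise dendrogram the abstract describes.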

Dates and versions

hal-02952538, version 1 (29-09-2020)

Identifiers

  • HAL Id: hal-02952538, version 1

Cite

Lise Bellanger, Arthur Coulon, Philippe Husi. PerioClust: a simple hierarchical agglomerative clustering approach including constraints. Springer. Data Analysis and Rationality in a Complex World, XXIII, 2020, Springer Series "Studies in Classification, Data Analysis, and Knowledge Organization", 978-3-030-60104-1. ⟨hal-02952538⟩