Conference paper, Year: 2010

Active Learning for Semi-Supervised K-Means Clustering

Abstract

The K-Means algorithm is one of the most widely used clustering algorithms for Knowledge Discovery in Data Mining. Seed-based K-Means integrates a small set of labeled data (called seeds) into the K-Means algorithm to improve its performance and overcome its sensitivity to the initial centers. Most of the time, these centers are generated at random or are assumed to be available for each cluster. This paper introduces a new, efficient algorithm for active seed selection that relies on a Min-Max approach favoring coverage of the whole dataset. Experiments conducted on artificial and real datasets show that, with our active seed selection algorithm, each cluster contains at least one seed after a very small number of queries, which helps reduce the number of iterations until convergence, a crucial property in many KDD applications.
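The abstract does not spell out the selection rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that the Min-Max criterion can be approximated by a farthest-first traversal: each query goes to the point whose distance to its nearest already-queried point is largest, and querying stops once every cluster holds a seed or the budget is spent. The function names (min_max_seed_queries, seeded_kmeans, centroid), the oracle callback, and the synthetic three-blob dataset are all hypothetical and chosen for illustration.

```python
import math
import random


def min_max_seed_queries(points, oracle, n_clusters, max_queries, first=0):
    """Farthest-first (min-max) querying; an assumed stand-in for the paper's rule.

    oracle(i) returns the true cluster label of point i (one user query).
    Returns the seeds grouped by label and the ordered list of queried indices.
    """
    queried = [first]
    seeds = {oracle(first): [first]}
    # min_dist[i] = distance from point i to its nearest queried point
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(seeds) < n_clusters and len(queried) < max_queries:
        # Pick the point that maximizes the minimum distance to queried points.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        if min_dist[nxt] == 0.0:
            break  # every remaining point coincides with a queried one
        queried.append(nxt)
        seeds.setdefault(oracle(nxt), []).append(nxt)
        min_dist = [min(d, math.dist(p, points[nxt]))
                    for d, p in zip(min_dist, points)]
    return seeds, queried


def centroid(rows):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def seeded_kmeans(points, seeds, max_iter=100):
    """K-Means whose initial centers are the centroids of the labeled seeds."""
    labels = sorted(seeds)
    centers = [centroid([points[i] for i in seeds[lab]]) for lab in labels]
    assign, iteration = [], 0
    for iteration in range(1, max_iter + 1):
        new_assign = [min(range(len(centers)),
                          key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_assign == assign:
            break  # assignments are stable: converged
        assign = new_assign
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = centroid(members)
    return [labels[a] for a in assign], centers, iteration


# Hypothetical data: three well-separated Gaussian blobs with known labels.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points, truth = [], []
for label, (cx, cy) in enumerate(blob_centers):
    for _ in range(20):
        points.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)))
        truth.append(label)

seeds, queried = min_max_seed_queries(
    points, lambda i: truth[i], n_clusters=3, max_queries=10)
predicted, centers, n_iter = seeded_kmeans(points, seeds)
print(len(queried), "queries;", n_iter, "K-Means iterations")
```

On well-separated data like this, farthest-first querying tends to jump to an unvisited cluster on each query, which is the coverage behavior the abstract describes. Real datasets with overlapping or unbalanced clusters may need more queries, and the sketch makes no claim about matching the paper's reported results.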
No file deposited

Dates and versions

hal-01292094 , version 1 (22-03-2016)


Cite

Viet Vu Vu, Nicolas Labroche, Bernadette Bouchon-Meunier. Active Learning for Semi-Supervised K-Means Clustering. The 22nd IEEE International Conference on Tools with Artificial Intelligence (ICTAI-2010), Oct 2010, Arras, France. pp.12-15, ⟨10.1109/ICTAI.2010.11⟩. ⟨hal-01292094⟩