A new incremental growing neural gas algorithm based on clusters labeling maximization: application to clustering of heterogeneous textual data
Abstract
Neural clustering algorithms perform well in the usual setting of homogeneous textual datasets. This is especially true of their recent adaptive versions, such as the incremental growing neural gas algorithm (IGNG). Nevertheless, this paper clearly shows that the performance of these algorithms, as well as that of more classical algorithms, decreases drastically when a heterogeneous textual dataset is given as input. A new incremental growing neural gas algorithm that incrementally exploits knowledge derived from the current labeling of the clusters is proposed as an alternative to the original distance-based algorithm. This solution yields a very significant increase in performance for the clustering of heterogeneous textual data. Moreover, it gives the proposed algorithm a genuinely incremental character.