A multi-scale seriation algorithm for clustering sparse imbalanced data: application to spike sorting

Vincent Vigneron; Hsin Chen

doi:10.1007/s10044-015-0458-2

Journal article, Pattern Analysis and Applications, Year: 2016


Vincent Vigneron

Role: Author
PersonId: 1330394
IdHAL: vincent-vigneron
ORCID: 0000-0001-5917-6041
IdRef: 224567969

Informatique, Biologie Intégrative et Systèmes Complexes

Hsin Chen

Role: Author

National Tsing Hua University

Abstract

Seriation is a useful statistical method for visualizing clusters in a dataset. However, when the data are noisy or imbalanced, visualizing the data structure becomes challenging. To alleviate this limitation, we introduce a novel metric based on common neighborhood to evaluate the degree of sparsity in a dataset. A stack of matrices is derived for different levels of sparsity, and each matrix is permuted by a branch-and-bound algorithm. The matrix with the best block-diagonal form is then selected by a compactness criterion. The selected matrix reveals the intrinsic structure of the data by excluding noisy data or outliers. This seriation algorithm is applicable even if the number of clusters is unknown or if the clusters are imbalanced. However, if the metric introduces too much sparsity in the data, sparsely sampled groups of data may be discarded. To resolve this problem, a multi-scale approach combining different levels of sparsity is proposed. The capability of the proposed seriation method is examined both on toy problems and in the context of spike sorting.
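The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the "common neighborhood" metric is approximated by the number of shared k-nearest neighbours, the branch-and-bound permutation is replaced by a much simpler grouping of connected components, and the compactness criterion is left out. All function names (`shared_neighbor_counts`, `sparsified_stack`, `component_order`) and the toy data are hypothetical.

```python
import numpy as np

def shared_neighbor_counts(X, k):
    """Assumed stand-in metric: entry (i, j) is the number of
    k-nearest neighbours that points i and j have in common."""
    n = len(X)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # a point is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]      # indices of the k nearest points
    member = np.zeros((n, n), dtype=int)
    member[np.arange(n)[:, None], nn] = 1  # member[i, j] = 1 if j is in kNN(i)
    return member @ member.T

def sparsified_stack(S, thresholds):
    """One binary matrix per sparsity level: keep pairs sharing >= t neighbours."""
    return [(S >= t).astype(int) for t in thresholds]

def component_order(A):
    """Simplified substitute for the branch-and-bound step: list the indices
    so that each connected component of A is contiguous (block-diagonal form)."""
    n = len(A)
    labels = [-1] * n
    order = []
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            order.append(i)
            for j in np.flatnonzero(A[i]):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(int(j))
        current += 1
    return order, labels

# Toy data (hypothetical): an interleaved 6-point cluster and 3-point cluster.
values = [0, 100, 1, 2, 101, 3, 4, 102, 5]
X = np.array(values, dtype=float).reshape(-1, 1)
S = shared_neighbor_counts(X, k=2)

n_blocks = {}
permuted = {}
for t, A in zip((1, 2), sparsified_stack(S, (1, 2))):
    order, labels = component_order(A)
    permuted[t] = A[np.ix_(order, order)]  # rows and columns reordered together
    n_blocks[t] = max(labels) + 1

print(n_blocks)
print(permuted[1])
```

On this toy input the looser threshold yields two diagonal blocks of unequal size, while the stricter threshold isolates every point, which mirrors the over-sparsification problem that motivates combining several sparsity levels.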

Keywords

Seriation; Data visualization; Clustering; Sparsity; Spike sorting

Domains

Signal and image processing [eess.SP]

Contributor: Frédéric Davesne

https://hal.science/hal-01133002

Submitted on: Wednesday, 18 March 2015, 12:52:03

Last modified on: Monday, 22 April 2024, 16:23:54

Dates and versions

hal-01133002, version 1 (18-03-2015)

Identifiers

HAL Id: hal-01133002, version 1
DOI: 10.1007/s10044-015-0458-2

Cite

Vincent Vigneron, Hsin Chen. A multi-scale seriation algorithm for clustering sparse imbalanced data: application to spike sorting. Pattern Analysis and Applications, 2016, 19 (4), pp. 885-903. ⟨10.1007/s10044-015-0458-2⟩. ⟨hal-01133002⟩


Collections

UNIV-EVRY IBISC IBISC-SIBI UNIV-PARIS-SACLAY GS-ENGINEERING GS-COMPUTER-SCIENCE

