OUTLIERS EMPHASIS ON CLUSTER ANALYSIS The use of squared Euclidean distance and fuzzy clustering to detect outliers in a dataset - HAL open archive
Preprint, Working Paper Year: 2014

OUTLIERS EMPHASIS ON CLUSTER ANALYSIS The use of squared Euclidean distance and fuzzy clustering to detect outliers in a dataset

Gianluca Rosso
  • Role: Author
  • PersonId : 954155

Abstract

In statistics, an outlier is an anomalous, aberrant observation that lies clearly distant from the other observations collected. Whether outliers should be included when computing averages is the subject of lively debate in many contexts. Outliers can become a valuable source of information, provided that their presence in the reference datasets can be identified accurately. The need to detect clustered outliers in a dataset that has not been treated beforehand argues for fuzzy clustering, with the effect emphasized by using the squared Euclidean distance as the similarity measure. For interesting and useful results, a possibilistic clustering approach is preferable, where "possibilistic" denotes, still with mathematical rigor, an interpretive component applied to the values that point to anomalous cases. The crisp method does not allow this, the fuzzy method introduces it, and the possibilistic method makes use of it. This is a very simple paper with a popularizing purpose, addressed especially, but not only, to students.
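The contrast the abstract draws between fuzzy and possibilistic assignments can be illustrated with a minimal Python sketch. It is not taken from the paper: it assumes a standard fuzzy c-means iteration on squared Euclidean distances, followed by a single typicality pass in the style of possibilistic c-means. The function names, the fuzzifier `m = 2`, the explicit initial centers, the toy data and the cut-off `threshold = 0.2` are all illustrative assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def memberships(points, centers, m=2.0, eps=1e-12):
    """Fuzzy memberships: each row sums to 1 across clusters."""
    rows = []
    for p in points:
        # eps avoids division by zero when a point sits exactly on a center
        inv = [(sq_dist(p, c) + eps) ** (-1.0 / (m - 1.0)) for c in centers]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows

def update_centers(points, U, m=2.0):
    """Each center is the mean of all points weighted by membership**m."""
    dim = len(points[0])
    centers = []
    for i in range(len(U[0])):
        w = [row[i] ** m for row in U]
        total = sum(w)
        centers.append(tuple(
            sum(wk * p[d] for wk, p in zip(w, points)) / total
            for d in range(dim)))
    return centers

def fuzzy_c_means(points, init_centers, m=2.0, n_iter=100):
    """Fixed number of alternating updates (assumed sufficient here)."""
    centers = [tuple(c) for c in init_centers]
    for _ in range(n_iter):
        centers = update_centers(points, memberships(points, centers, m), m)
    return centers, memberships(points, centers, m)

def typicalities(points, centers, U, m=2.0):
    """Possibilistic-style typicalities: rows need NOT sum to 1."""
    T = [[0.0] * len(centers) for _ in points]
    for i, c in enumerate(centers):
        d = [sq_dist(p, c) for p in points]
        w = [row[i] ** m for row in U]
        # eta: membership-weighted mean squared distance of cluster i
        eta = sum(wk * dk for wk, dk in zip(w, d)) / sum(w)
        for k, dk in enumerate(d):
            T[k][i] = 1.0 / (1.0 + (dk / eta) ** (1.0 / (m - 1.0)))
    return T

def flag_outliers(points, init_centers, threshold=0.2, m=2.0):
    """Indices of points whose best typicality is below the threshold."""
    centers, U = fuzzy_c_means(points, init_centers, m)
    T = typicalities(points, centers, U, m)
    return [k for k, row in enumerate(T) if max(row) < threshold]

# Toy data (hypothetical): two tight groups plus one distant point at index 10.
points = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (0.1, -0.2), (-0.2, -0.1),
          (5.0, 5.0), (5.2, 5.1), (4.9, 5.2), (5.1, 4.8), (4.8, 4.9),
          (10.0, -5.0)]
init = [points[0], points[5]]  # assumed starting centers, one per group
centers, U = fuzzy_c_means(points, init)
outliers = flag_outliers(points, init)
print("fuzzy memberships of the distant point:", U[10])
print("flagged indices:", outliers)
```

In this sketch the fuzzy memberships of the distant point are forced to sum to one, so it looks like an ordinary borderline case split between the two groups. Its typicalities are not normalized, so both can be small at once, and that is what lets the threshold single it out. The threshold value is a judgment call and would need tuning on real data.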
Main file
Outliers_on_clustering.pdf (268.1 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00962248, version 1 (20-03-2014)

Identifiers

  • HAL Id: hal-00962248, version 1

Cite

Gianluca Rosso. OUTLIERS EMPHASIS ON CLUSTER ANALYSIS The use of squared Euclidean distance and fuzzy clustering to detect outliers in a dataset. 2014. ⟨hal-00962248⟩
86 Views
166 Downloads
