Conference papers

Missing Data in Hierarchical Classification of Variables, a simulation study

Abstract : Building on an earlier study, we examine the effect of missing data on the hierarchical classification of variables according to the following factors: amount of missing data, imputation technique, similarity coefficient, and aggregation criterion. We use two imputation methods: regression imputation based on ordinary least squares (OLS) and an EM algorithm. The similarity matrices are built from the basic affinity coefficient and from Pearson's correlation coefficient. As aggregation criteria we apply average linkage, single linkage, and complete linkage. To compare the structure of the resulting hierarchical classifications, we use Spearman's coefficient between the associated ultrametrics. We present simulation experiments for five multivariate normal cases.
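The comparison described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' protocol: the function names `impute_ols` and `ultrametric`, the block covariance matrix, the sample size, the 20% missingness rate confined to a single variable, and the dissimilarity 1 - |r| derived from Pearson's correlation are all choices made for this example. The affinity coefficient and the EM imputation mentioned in the abstract are not implemented here.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr


def impute_ols(data, col):
    """Fill NaNs in column `col` with OLS predictions from the other columns.

    Assumes (for this sketch) that every other column is fully observed.
    """
    filled = data.copy()
    missing = np.isnan(data[:, col])
    others = np.delete(data, col, axis=1)
    design = np.column_stack([np.ones(len(data)), others])
    # Fit the regression on the rows where the target column is observed.
    beta, *_ = np.linalg.lstsq(design[~missing], data[~missing, col], rcond=None)
    filled[missing, col] = design[missing] @ beta
    return filled


def ultrametric(data, method="average"):
    """Condensed cophenetic (ultrametric) distances between the variables."""
    corr = np.corrcoef(data, rowvar=False)
    dist = 1.0 - np.abs(corr)  # assumed dissimilarity built from Pearson's r
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method=method)
    return cophenet(tree)


# Hypothetical multivariate normal case: two blocks of correlated variables.
rng = np.random.default_rng(0)
blocks = np.zeros((5, 5))
blocks[:3, :3] = 1.0
blocks[3:, 3:] = 1.0
cov = 0.2 * np.eye(5) + 0.6 * blocks + 0.2 * np.ones((5, 5))
complete = rng.multivariate_normal(np.zeros(5), cov, size=200)

# Delete about 20% of the first variable completely at random, then impute.
with_gaps = complete.copy()
with_gaps[rng.random(200) < 0.2, 0] = np.nan
imputed = impute_ols(with_gaps, 0)

# Spearman's coefficient between the two ultrametrics.
rho, _ = spearmanr(ultrametric(complete), ultrametric(imputed))
print(f"Spearman coefficient between ultrametrics: {rho:.3f}")
```

Passing `method="single"` or `method="complete"` to `ultrametric` switches the aggregation criterion, so the same loop can be repeated over linkage methods and missingness rates.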


https://hal.archives-ouvertes.fr/hal-01124802
Contributor : Laboratoire Cedric
Submitted on : Thursday, March 26, 2020 - 5:02:49 PM
Last modification on : Monday, March 30, 2020 - 11:48:10 AM

File

RC481.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01124802, version 1


Citation

Ana Lorga da Silva, Helena Bacelar-Nicolau, Gilbert Saporta. Missing Data in Hierarchical Classification of Variables, a simulation study. IFCS 2002, International Federation of Classification Societies, Jul 2002, Krakow, Poland. pp.121-128. ⟨hal-01124802⟩

