
Missing Data in Hierarchical Classification of Variables, a simulation study

Abstract: Extending an earlier study, we investigate how missing data affect the hierarchical classification of variables as a function of four factors: the amount of missing data, the imputation technique, the similarity coefficient, and the aggregation criterion. Two imputation methods are used: regression imputation fitted by ordinary least squares (OLS) and an EM algorithm. Similarity matrices are built from the basic affinity coefficient and Pearson's correlation coefficient. The aggregation criteria are average linkage, single linkage, and complete linkage. To compare the structure of the resulting hierarchical classifications, we use Spearman's coefficient between the associated ultrametrics. We present simulation experiments for five multivariate normal cases.
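The comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and rests on several assumptions: NumPy and SciPy are used; the dissimilarity between variables is taken as 1 - |r| with r Pearson's correlation (the affinity coefficient is not implemented); column-mean imputation stands in for the OLS and EM imputations; and the covariance structure, sample size, and 10% missing rate are arbitrary illustrative choices. The helper names `ultrametric` and `impute_column_means` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr


def ultrametric(data, method="average"):
    """Condensed cophenetic (ultrametric) distances of a hierarchy built on the columns of data."""
    corr = np.corrcoef(data, rowvar=False)
    # Assumption: turn Pearson similarity into a dissimilarity via 1 - |r|.
    dissim = 1.0 - np.abs(corr)
    np.fill_diagonal(dissim, 0.0)
    condensed = squareform(dissim, checks=False)
    tree = linkage(condensed, method=method)  # "average", "single" or "complete"
    return cophenet(tree)


def impute_column_means(data):
    """Stand-in imputation (column means); the paper uses OLS regression and EM instead."""
    filled = data.copy()
    col_means = np.nanmean(filled, axis=0)
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = col_means[cols]
    return filled


rng = np.random.default_rng(0)

# Illustrative multivariate normal case: two blocks of three correlated variables.
cov = np.kron(np.eye(2), np.full((3, 3), 0.7)) + 0.3 * np.eye(6)
complete = rng.multivariate_normal(np.zeros(6), cov, size=200)

# Delete roughly 10% of the entries completely at random (illustrative rate).
incomplete = complete.copy()
incomplete[rng.random(complete.shape) < 0.10] = np.nan

for method in ("average", "single", "complete"):
    u_full = ultrametric(complete, method)
    u_imputed = ultrametric(impute_column_means(incomplete), method)
    rho, _ = spearmanr(u_full, u_imputed)
    print(f"{method:>8} linkage: Spearman between ultrametrics = {rho:.3f}")
```

A Spearman coefficient close to 1 indicates that the hierarchy obtained after imputation preserves the ordering of merge levels found with complete data. Varying the missing rate, the imputation function, and the linkage method reproduces the kind of factorial comparison the abstract describes.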
Document type: Conference papers

Cited literature: 13 references
Contributor : Laboratoire Cedric
Submitted on : Thursday, March 26, 2020 - 5:02:49 PM
Last modification on : Monday, March 30, 2020 - 11:48:10 AM


Files produced by the author(s)


  • HAL Id : hal-01124802, version 1



Ana Lorga da Silva, Helena Bacelar-Nicolau, Gilbert Saporta. Missing Data in Hierarchical Classification of Variables, a simulation study. IFCS 2002, International Federation of Classification Societies, Jul 2002, Krakow, Poland. pp.121-128. ⟨hal-01124802⟩


