A variant of Gaussian mixture models to cluster censored individuals
Abstract
The French Observatory of Indoor Air Quality measured the indoor air concentrations of 20 volatile organic compounds in 567 dwellings, each over one week, between 2003 and 2005. We propose to define profiles of indoor pollution using classical clustering methods based on Gaussian mixture models and the EM algorithm. However, some concentration values are missing because of measurement device failures, and others are left-censored because of the limited sensitivity of the devices (two thresholds). We adapt these methods to handle censored data by using the truncated Gaussian distribution.
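To illustrate how a truncated Gaussian can stand in for a left-censored value, the sketch below computes the mean and variance of a Gaussian variable conditioned on lying below a censoring threshold. These are the kind of conditional moments an EM step could use in place of the unobserved value. It is a simplified illustration under stated assumptions, not the method of the paper: it treats a single univariate component with known parameters, and the function name `left_censored_moments` and its arguments are hypothetical.

```python
import math


def _phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def left_censored_moments(mu, sigma, threshold):
    """Mean and variance of X ~ N(mu, sigma^2) given X < threshold.

    Assumption: one univariate Gaussian component with known mu and sigma.
    Note: _Phi(beta) can underflow to 0 when the threshold lies far below mu;
    this sketch does not guard against that case.
    """
    beta = (threshold - mu) / sigma
    lam = _phi(beta) / _Phi(beta)  # ratio of density to mass below the threshold
    mean = mu - sigma * lam
    var = sigma ** 2 * (1.0 - beta * lam - lam ** 2)
    return mean, var


# Example: standard normal censored below 0.
mean, var = left_censored_moments(0.0, 1.0, 0.0)
print(mean, var)  # approximately -0.7979 and 0.3634
```

In a mixture setting, such moments would be computed per component and weighted by the posterior membership probabilities; a multivariate version requires conditioning on the observed coordinates as well, which this sketch leaves out.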