Journal article, Annals of Applied Statistics, Year: 2022

Co-Clustering of Multivariate Functional Data for the Analysis of Air Pollution in the South of France

Abstract

Air pollution is now a major threat to public health, with clear links to many diseases, especially cardiovascular ones. The spatio-temporal study of pollution is of great interest to governments and local authorities when deciding on public alerts or new city policies against rising pollution. The aim of this work is to study spatio-temporal profiles of environmental data collected in the south of France (Région Sud) by the public agency AtmoSud. The idea is to better understand the exposure of inhabitants to pollutants over a large territory with important differences in terms of geography and urbanization. The data consist of daily measurements of five environmental variables, namely three pollutants (PM10, NO2, O3) and two meteorological factors (pressure and temperature), over six years. These data can be seen as multivariate functional data: quantitative entities evolving over time, for which there is a growing need for methods to summarize and understand them. For this purpose, a novel co-clustering model for multivariate functional data is defined. The model is based on a functional latent block model which assumes, for each co-cluster, a probabilistic distribution for multivariate functional principal component scores. A Stochastic EM algorithm, embedding a Gibbs sampler, is proposed for model inference, as well as a model selection criterion for choosing the number of co-clusters. Applying the proposed co-clustering algorithm to the environmental data of the Région Sud made it possible to divide the region, composed of 357 zones, into six macro-areas with common exposure to pollution. We showed that pollution profiles vary according to the seasons and that the patterns are conserved over the six years studied.
These results can be used by local authorities to develop specific programs to reduce pollution at the macro-area level and to identify specific periods of the year with high pollution peaks in order to set up specific health prevention programs. Overall, the proposed co-clustering approach is a powerful resource for analysing multivariate functional data in order to identify intrinsic data structure and summarize variable profiles over long periods of time.
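
The inference scheme named in the abstract (a Stochastic EM algorithm embedding a Gibbs sampler, applied to a latent block model) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes that the functional principal component scores have already been computed and stored in a hypothetical array `scores` of shape (rows, columns, d), it swaps the block distribution of the paper for a simple isotropic Gaussian per co-cluster, and it returns the last sampled partitions. The function name `sem_gibbs_lbm` and its parameters are illustrative only.

```python
import numpy as np

def sem_gibbs_lbm(scores, K, L, n_iter=30, seed=0):
    """Toy SEM-Gibbs for a latent block model with isotropic Gaussian blocks.

    scores: array (n_rows, n_cols, d) of precomputed component scores (assumed).
    Returns the last sampled row labels z and column labels w.
    """
    rng = np.random.default_rng(seed)
    n, p, d = scores.shape
    z = rng.integers(K, size=n)  # random initial row partition
    w = rng.integers(L, size=p)  # random initial column partition
    flat = scores.reshape(-1, d)
    global_mean = flat.mean(axis=0)
    global_var = ((flat - global_mean) ** 2).mean() + 1e-6

    for _ in range(n_iter):
        # M step: proportions (smoothed to avoid log(0)) and block parameters
        alpha = (np.bincount(z, minlength=K) + 1.0) / (n + K)
        beta = (np.bincount(w, minlength=L) + 1.0) / (p + L)
        mu = np.tile(global_mean, (K, L, 1))  # fallback for empty blocks
        var = np.full((K, L), global_var)
        for k in range(K):
            for l in range(L):
                block = scores[z == k][:, w == l].reshape(-1, d)
                if block.shape[0] > 0:
                    mu[k, l] = block.mean(axis=0)
                    var[k, l] = ((block - mu[k, l]) ** 2).mean() + 1e-6

        # Log-density of every cell (i, j) under every block (k, l)
        sq = ((scores[:, :, None, None, :] - mu) ** 2).sum(axis=-1)
        logf = -0.5 * (d * np.log(2 * np.pi * var) + sq / var)

        # SE step (Gibbs): sample rows given columns, then columns given rows.
        # Adding Gumbel noise and taking the argmax draws from the softmax.
        W = np.eye(L)[w]
        row_logp = np.log(alpha) + np.einsum("ijkl,jl->ik", logf, W)
        z = np.argmax(row_logp + rng.gumbel(size=row_logp.shape), axis=1)
        Z = np.eye(K)[z]
        col_logp = np.log(beta) + np.einsum("ijkl,ik->jl", logf, Z)
        w = np.argmax(col_logp + rng.gumbel(size=col_logp.shape), axis=1)
    return z, w
```

With a synthetic array such as `scores = np.random.default_rng(1).normal(size=(12, 8, 2))`, calling `sem_gibbs_lbm(scores, K=2, L=3)` returns one label per row and one per column. The resulting partitions are random draws, so in practice one would run several initializations and compare them with a model selection criterion.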
Main file
funLBMmulti-AOAS.pdf (5.19 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02862177 , version 1 (09-06-2020)
hal-02862177 , version 2 (14-09-2021)

Cite

Charles Bouveyron, Julien Jacques, Amandine Schmutz, Fanny Simoes, Silvia Bottini. Co-Clustering of Multivariate Functional Data for the Analysis of Air Pollution in the South of France. Annals of Applied Statistics, 2022, 16 (3), pp.1400-1422. ⟨10.1214/21-AOAS1547⟩. ⟨hal-02862177v2⟩