Component-based regularisation of a multivariate GLM with a thematic partitioning of the explanatory variables
Abstract
We address component-based regularisation of a multivariate Generalised Linear Model (GLM). A set of random responses Y is assumed to depend, through a GLM, on a set X of explanatory variables, as well as on a set A of additional covariates. X is partitioned into R conceptually homogeneous variable groups X_1, ..., X_R, viewed as explanatory themes. The variables in each X_r are assumed to be many and redundant, so generalised linear regression demands dimension reduction and regularisation with respect to each X_r. By contrast, the variables in A are assumed to be few and selected so as to demand no regularisation. Regularisation is performed by searching each X_r for an appropriate number of orthogonal components that both contribute to modelling Y and capture relevant structural information in X_r. To estimate a single-theme model, we first propose an enhanced version of Supervised Component Generalised Linear Regression (SCGLR), based on a flexible measure of the structural relevance of components and able to deal with mixed-type explanatory variables. Then, to estimate the multiple-theme model, we develop an algorithm encapsulating this enhanced SCGLR: THEME-SCGLR. The method is tested on simulated data and then applied to rainforest data in order to model the abundance of tree species.
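As a reading aid, the model structure described above can be sketched in symbols. The notation is an assumption introduced here for illustration and is not taken from the abstract: eta_k is the linear predictor of the k-th response, f_r^h is the h-th component of theme X_r with loading vector u_r^h, H_r is the number of components retained in theme r, and gamma and delta are regression coefficients. The inner product defining orthogonality is left unspecified.

```latex
% Illustrative sketch only; every symbol except X_r, A, R and Y is assumed notation.
\begin{align*}
  f_r^h  &= X_r u_r^h, \qquad r = 1,\dots,R, \quad h = 1,\dots,H_r, \\
  \eta_k &= \sum_{r=1}^{R} \sum_{h=1}^{H_r} \gamma_{k,r}^{h}\, f_r^h + A\,\delta_k, \\
  \langle f_r^h, f_r^{h'} \rangle &= 0 \quad \text{for } h \neq h' .
\end{align*}
```

In this sketch only the themes X_r enter through components, which is where regularisation takes place; the covariates in A enter the predictor directly.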
Origin: Files produced by the author(s)