Conference papers

Variable selection with the GLM-Lasso for malaria risk prediction (original title: Sélection de variables par le GLM-Lasso pour la prédiction du risque palustre)

Abstract : In this study, we propose an automatic learning method for variable selection based on the Lasso in an epidemiological context. One aim of this approach is to avoid the preprocessing that experts in medicine and epidemiology apply to collected data. This preprocessing consists of recoding some variables and choosing some interactions based on expertise. The proposed approach uses all available explanatory variables without preprocessing and automatically generates all interactions between them. This leads to a high-dimensional problem. We use the Lasso, a robust method for variable selection in high dimension. To avoid overfitting, a two-level cross-validation is used. Because the target variable is a count variable and the Lasso estimators are biased, the variables selected by the Lasso are debiased by a GLM and used to predict the distribution of Anopheles, the main vector of malaria. Results show that only a few climatic and environmental variables are the main factors associated with malaria risk exposure.
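The two structural steps named in the abstract, automatic generation of interactions and two-level cross-validation, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "all interactions" means pairwise products of the explanatory variables, and it uses an interleaved fold assignment chosen only for illustration. The variable names (rain, temp, humidity, vegetation), the toy values, and the helper functions are hypothetical. The Lasso fit and the GLM refit on the selected variables are not shown.

```python
from itertools import combinations


def add_pairwise_interactions(rows, names):
    """Append every pairwise product x_i * x_j (i < j) to each row.

    Assumption: interactions are pairwise products; the abstract does not
    say which interaction order the authors generate.
    """
    pairs = list(combinations(range(len(names)), 2))
    new_names = list(names) + [f"{names[i]}:{names[j]}" for i, j in pairs]
    new_rows = [list(r) + [r[i] * r[j] for i, j in pairs] for r in rows]
    return new_rows, new_names


def k_folds(indices, k):
    """Split indices into k interleaved folds and yield (train, held_out)."""
    indices = list(indices)
    folds = [indices[f::k] for f in range(k)]
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, folds[f]


def two_level_splits(n, outer_k, inner_k):
    """Yield (outer_train, test, inner_splits) for each outer fold.

    The outer folds would estimate prediction error. The inner folds are
    built only from the outer training part and would be used to tune the
    Lasso penalty, so the test part never influences model selection.
    """
    for outer_train, test in k_folds(range(n), outer_k):
        yield outer_train, test, list(k_folds(outer_train, inner_k))


# Hypothetical toy data: 4 explanatory variables, 2 observations.
names = ["rain", "temp", "humidity", "vegetation"]
rows = [[1.0, 2.0, 3.0, 4.0], [0.5, 1.5, 2.5, 3.5]]
expanded_rows, expanded_names = add_pairwise_interactions(rows, names)

# p variables give p + p * (p - 1) / 2 columns after expansion.
print(len(names), "->", len(expanded_names))
```

With p explanatory variables the expanded design has p + p(p - 1)/2 columns, which is why the number of candidate predictors can quickly exceed the number of observations and a sparse selection method becomes necessary.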

Cited literature : 7 references
Contributor : Fabrice Rossi
Submitted on : Wednesday, September 9, 2015 - 5:52:02 PM
Last modification on : Sunday, January 19, 2020 - 6:38:32 PM
Long-term archiving on : Monday, December 28, 2015 - 11:23:51 PM


Files produced by the author(s)


Distributed under a Creative Commons Attribution 4.0 International License


  • HAL Id : hal-01196450, version 1
  • ARXIV : 1509.02873



Bienvenue Kouwayè, Noël Fonton, Fabrice Rossi. Sélection de variables par le GLM-Lasso pour la prédiction du risque palustre. 47èmes Journées de Statistique de la SFdS, Société Française de Statistique, Jun 2015, Lille, France. ⟨hal-01196450⟩


