C-mix: a high dimensional mixture model for censored durations, with applications to genetic data

We introduce a supervised learning mixture model for censored durations (C-mix) to simultaneously detect subgroups of patients with different prognoses and order them based on their risk. Our method is applicable in a high-dimensional setting, i.e. with a large number of biomedical covariates. Indeed, we penalize the negative log-likelihood by the Elastic-Net, which leads to a sparse parameterization of the model and automatically pinpoints the relevant covariates for the survival prediction. Inference is achieved using an efficient Quasi-Newton Expectation Maximization (QNEM) algorithm, for which we provide convergence properties. The statistical performance of the method is examined in an extensive Monte Carlo simulation study and then illustrated on three publicly available genetic cancer datasets with high-dimensional covariates. We show that our approach outperforms the state-of-the-art survival models in this context, namely both the CURE and Cox proportional hazards models penalized by the Elastic-Net, in terms of C-index, AUC(t) and survival prediction. Thus, we propose a powerful tool for personalized medicine in oncology.
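The abstract relies on two standard ingredients, the Elastic-Net penalty and the C-index. The sketch below illustrates both in plain Python. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `strength` and `l1_ratio` parameterization, and the toy data are hypothetical, and the paper may weight the two penalty terms differently.

```python
def elastic_net_penalty(beta, strength, l1_ratio):
    """Elastic-Net penalty in one common parameterization (an assumption here):
    strength * (l1_ratio * ||beta||_1 + (1 - l1_ratio) / 2 * ||beta||_2^2)."""
    l1 = sum(abs(b) for b in beta)
    l2_squared = sum(b * b for b in beta)
    return strength * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2_squared)


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored durations.

    A pair (i, j) is comparable when subject i has an observed event and a
    strictly shorter time than subject j. The pair is concordant when the
    subject with the shorter time also has the higher risk score; ties in
    the risk score count for one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Toy data (hypothetical): durations, event indicators (1 = observed,
# 0 = censored) and risk scores from some fitted model.
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [1, 0, 1, 1]
toy_risk = [0.9, 0.8, 0.3, 0.1]

toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
toy_penalty = elastic_net_penalty([1.0, -2.0], strength=0.5, l1_ratio=0.5)
```

In a penalized fit, a term like `elastic_net_penalty` would be added to the negative log-likelihood before minimization, while the C-index is computed afterwards on held-out data to compare models.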

Keywords

Cox's proportional hazards model; CURE model; Elastic-net regularization; High-dimensional estimation; Mixture duration model; Survival analysis

Domains

Machine Learning [stat.ML]

Main file

bggj.pdf (2.04 MB)

Origin: Files produced by the author(s)

Contributor: Simon Bussy

https://hal.science/hal-01648389

Submitted on: Saturday, November 25, 2017, 18:50:13

Last modified on: Friday, April 19, 2024, 16:18:58

Dates and versions

hal-01648389, version 1 (25-11-2017)

Identifiers

HAL Id: hal-01648389, version 1

Cite

Simon Bussy, Agathe Guilloux, Stéphane Gaïffas, Anne-Sophie Jannot. C-mix: a high dimensional mixture model for censored durations, with applications to genetic data. Statistical Methods in Medical Research, 2019, 28 (5), pp. 1523-1539. ⟨hal-01648389⟩

Collections

INSERM UNIV-PARIS7 X EPHE CNRS UNIV-EVRY INRA X-CMAP X-DEP-MATHA CMAP CORDELIERS LSTA PSL USPC LAMME UNIV-PARIS-SACLAY LPSM SORBONNE-UNIVERSITE SU-SCIENCES INRAE UP-SANTE UP-SCIENCES GS-ENGINEERING MATHNUM
