Binacox: automatic cut-points detection in high-dimensional Cox model, with applications to genetic data

Abstract: Determining significant prognostic biomarkers is of increasing importance in many areas of medicine. In order to translate a continuous biomarker into a clinical decision, it is often necessary to determine cut-points. There is so far no standard method to help evaluate how many cut-points are optimal for a given feature in a survival analysis setting. Moreover, most existing methods are univariate, hence not well suited for high-dimensional frameworks. This paper introduces a prognostic method called Binacox to deal with the problem of detecting multiple cut-points per feature in a multivariate setting where a large number of continuous features are available. It is based on the Cox model and combines one-hot encodings with the binarsity penalty. This penalty uses total-variation regularization together with an extra linear constraint to avoid collinearity between the one-hot encodings and enable feature selection. A non-asymptotic oracle inequality is established. The statistical performance of the method is then examined in an extensive Monte Carlo simulation study, and finally illustrated on three publicly available genetic cancer datasets with high-dimensional features. On these datasets, our proposed methodology significantly outperforms state-of-the-art survival models regarding risk prediction in terms of C-index, with computing times that are orders of magnitude shorter. In addition, it provides powerful interpretability by automatically pinpointing significant cut-points on relevant features from a clinical point of view.
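The encoding and penalty described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the quantile-based binning, the unweighted total-variation term, and the helper names (binarize, total_variation, implied_cut_points) are assumptions made for illustration, and the Cox partial likelihood itself is left out.

```python
import numpy as np

def binarize(x, n_bins=5):
    """One-hot encode a continuous feature over quantile bins (assumed binning rule)."""
    # Interior bin edges: the n_bins - 1 quantiles strictly between min and max.
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    # Bin index in 0..n_bins-1 for every sample.
    idx = np.searchsorted(edges, x, side="right")
    onehot = np.zeros((x.size, n_bins))
    onehot[np.arange(x.size), idx] = 1.0
    return onehot, edges

def total_variation(theta):
    """Unweighted total variation of one coefficient block (a simplification)."""
    return np.abs(np.diff(theta)).sum()

def implied_cut_points(theta, edges, tol=1e-12):
    """Bin edges where consecutive coefficients differ, read as cut-points."""
    return edges[np.abs(np.diff(theta)) > tol]

# Hypothetical feature and coefficient block, chosen by hand for illustration.
x = np.arange(100.0)
onehot, edges = binarize(x, n_bins=5)
theta = np.array([-0.5, -0.5, -0.5, 0.75, 0.75])  # sums to zero (linear constraint)

print(total_variation(theta))            # size of the single jump
print(implied_cut_points(theta, edges))  # edge where the jump occurs
```

In this sketch, a block that is constant across all bins has zero total variation, and the sum-to-zero constraint then forces it to be identically zero, which is how a penalty of this kind can both select features and locate cut-points.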
Document type: Preprint, working paper
Contributor: Agathe Guilloux
Submitted on: Monday, June 18, 2018 - 14:09:54
Last modified on: Monday, March 18, 2019 - 15:59:49
Document(s) archived on: Wednesday, September 19, 2018 - 15:52:56


  • HAL Id: hal-01817823, version 1



Simon Bussy, Mokhtar Z. Alaya, Agathe Guilloux, Anne-Sophie Jannot. Binacox: automatic cut-points detection in high-dimensional Cox model, with applications to genetic data. 2018. 〈hal-01817823〉


