Conference paper. Year: 2018

Tree-based Cost-Sensitive Methods for Fraud Detection in Imbalanced Data

Guillaume Metzler
Xavier Badiche
  • Role: Author
  • PersonId : 1037464
Brahim Belkasmi
  • Role: Author
  • PersonId : 1037465
Elisa Fromont
Amaury Habrard
Marc Sebban

Abstract

Bank fraud detection is a difficult classification problem in which frauds are far rarer than genuine transactions. In this paper, we present cost-sensitive tree-based learning strategies for this highly imbalanced setting. We first propose a cost-sensitive splitting criterion for decision trees that takes into account the cost of each transaction, and we extend it with a decision rule for classification with tree ensembles. We then propose a new cost-sensitive loss for gradient boosting. Both methods have been shown to be particularly relevant in the context of imbalanced data. Experiments on a proprietary dataset of bank fraud detection in retail transactions show that our cost-sensitive algorithms increase the retailer's benefits by 1.43% compared to non-cost-sensitive ones, and that the gradient boosting approach outperforms all its competitors.
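
The abstract does not give the form of the proposed loss, so the following is only a generic illustration of what a cost-sensitive loss for gradient boosting can look like, not the authors' formulation. It sketches a cost-weighted logistic loss together with the per-example gradient and Hessian that gradient boosting implementations typically consume. The function name `cost_weighted_logloss` and the cost arguments `fn_costs` and `fp_costs` are hypothetical and chosen for this example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def cost_weighted_logloss(scores, labels, fn_costs, fp_costs):
    """Return (loss, gradients, hessians) of a cost-weighted logistic loss.

    scores   : raw model outputs (log-odds), one per transaction
    labels   : 1 for fraud, 0 for genuine
    fn_costs : assumed cost of missing each fraud (e.g. its amount)
    fp_costs : assumed cost of flagging each genuine transaction
    """
    eps = 1e-12  # guards against log(0)
    loss, grads, hess = 0.0, [], []
    for z, y, c_fn, c_fp in zip(scores, labels, fn_costs, fp_costs):
        p = sigmoid(z)
        # Each example is weighted by the cost of misclassifying it.
        w = c_fn if y == 1 else c_fp
        loss += -w * (y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
        grads.append(w * (p - y))        # first derivative w.r.t. the score
        hess.append(w * p * (1.0 - p))   # second derivative w.r.t. the score
    return loss, grads, hess


# Two transactions scored at log-odds 0: one fraud, one genuine.
loss, grads, hess = cost_weighted_logloss(
    scores=[0.0, 0.0],
    labels=[1, 0],
    fn_costs=[100.0, 100.0],
    fp_costs=[5.0, 5.0],
)
```

With these assumed costs, the missed fraud contributes a gradient twenty times larger in magnitude than the genuine transaction, which is how a cost weighting steers boosting toward the expensive errors.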
Main file
CSTree.pdf (359.29 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01895967, version 1 (09-11-2018)

Identifiers

HAL Id: hal-01895967
DOI: 10.1007/978-3-030-01768-2_18

Cite

Guillaume Metzler, Xavier Badiche, Brahim Belkasmi, Elisa Fromont, Amaury Habrard, et al. Tree-based Cost-Sensitive Methods for Fraud Detection in Imbalanced Data. IDA 2018 - 17th International Symposium on Intelligent Data Analysis, Oct 2018, ’s-Hertogenbosch, Netherlands. pp.213-224, ⟨10.1007/978-3-030-01768-2_18⟩. ⟨hal-01895967⟩