Tree-based Cost-Sensitive Methods for Fraud Detection in Imbalanced Data

Guillaume Metzler; Xavier Badiche; Brahim Belkasmi; Elisa Fromont; Amaury Habrard; Marc Sebban

doi:10.1007/978-3-030-01768-2_18

Conference paper, Year: 2018



Guillaume Metzler

Role: Author
PersonId : 740506
IdHAL : guillaume-metzler

Laboratoire Hubert Curien

Xavier Badiche

Role: Author
PersonId : 1037464

Blitz Business Service

Brahim Belkasmi

Role: Author
PersonId : 1037465

Blitz Business Service

Elisa Fromont

Role: Author
PersonId : 9985
IdHAL : efromont
ORCID : 0000-0003-0133-3491
IdRef : 095621601

Large Scale Collaborative Data Mining

Amaury Habrard

Role: Author
PersonId : 439
IdHAL : amaury-habrard
ORCID : 0000-0003-3038-9347
IdRef : 084103655

Laboratoire Hubert Curien

Marc Sebban

Role: Author
PersonId : 5203
IdHAL : marc-sebban
ORCID : 0000-0001-6851-169X
IdRef : 050802623

Laboratoire Hubert Curien

Abstract

Bank fraud detection is a difficult classification problem in which frauds are far less numerous than genuine transactions. In this paper, we present cost-sensitive tree-based learning strategies for this highly imbalanced setting. We first propose a cost-sensitive splitting criterion for decision trees that takes into account the cost of each transaction, and we extend it with a decision rule for classification with tree ensembles. We then propose a new cost-sensitive loss for gradient boosting. Both methods have been shown to be particularly relevant in the context of imbalanced data. Experiments on a proprietary dataset of bank fraud detection in retail transactions show that our cost-sensitive algorithms increase the retailer's benefits by 1.43% compared to non-cost-sensitive ones, and that the gradient boosting approach outperforms all its competitors.
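To make the idea of taking the cost of each transaction into account concrete, the following is a minimal Python sketch of a generic minimum-expected-cost decision rule. It is not the splitting criterion, ensemble decision rule, or boosting loss proposed in the paper, whose definitions are not given in this record. The cost model (a missed fraud loses the transaction amount, a false alarm costs a fixed `review_cost`), the function names, and the example values are all assumptions made for illustration.

```python
def expected_costs(p_fraud, amount, review_cost):
    """Return (cost_if_accept, cost_if_flag) for a single transaction.

    Assumed, illustrative cost model (not taken from the paper):
    - accepting a fraudulent transaction loses its full amount;
    - flagging a genuine transaction costs a fixed review fee.
    """
    cost_if_accept = p_fraud * amount
    cost_if_flag = (1.0 - p_fraud) * review_cost
    return cost_if_accept, cost_if_flag


def flag_as_fraud(p_fraud, amount, review_cost=5.0):
    """Flag the transaction when flagging has the lower expected cost."""
    cost_if_accept, cost_if_flag = expected_costs(p_fraud, amount, review_cost)
    return cost_if_flag < cost_if_accept


# Hypothetical (fraud probability, amount) pairs.
transactions = [(0.02, 20.0), (0.02, 900.0), (0.60, 3.0)]
decisions = [flag_as_fraud(p, a) for p, a in transactions]
print(decisions)
```

Under these assumed costs, the decision depends on the amount at stake as well as on the estimated fraud probability: a large transaction with a low fraud probability can be worth flagging, while a very small one with a high probability may not be.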

Domains

Artificial Intelligence [cs.AI]; Machine Learning [cs.LG]

Main file

CSTree.pdf (359.29 KB)

Origin: Files produced by the author(s)

Contributor: Guillaume Metzler

https://hal.science/hal-01895967

Submitted on: Friday, 9 November 2018, 09:54:17

Last modified on: Friday, 24 March 2023, 14:53:08

Long-term archiving on: Sunday, 10 February 2019, 13:15:05

Dates and versions

hal-01895967 , version 1 (09-11-2018)

Identifiers

HAL Id : hal-01895967 , version 1
DOI : 10.1007/978-3-030-01768-2_18

Cite

Guillaume Metzler, Xavier Badiche, Brahim Belkasmi, Elisa Fromont, Amaury Habrard, et al. Tree-based Cost-Sensitive Methods for Fraud Detection in Imbalanced Data. IDA 2018 - 17th International Symposium on Intelligent Data Analysis, Oct 2018, 's-Hertogenbosch, Netherlands. pp.213-224, ⟨10.1007/978-3-030-01768-2_18⟩. ⟨hal-01895967⟩


Collections

UNIV-ST-ETIENNE IOGS UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA PARISTECH CENTRALESUPELEC INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UDL UR1-MATH-NUM

