
Dataset shift quantification for credit card fraud detection

Abstract : Machine learning and data mining techniques have been used extensively to detect credit card fraud. However, purchase behaviour and fraudster strategies may change over time. This phenomenon is called dataset shift, or concept drift, in the domain of fraud detection. In this paper, we present a method to quantify, day by day, the dataset shift in our face-to-face credit card transactions dataset (card holder located in the shop). In practice, we classify the days against each other and measure the efficiency of the classification. The more efficient the classification, the more the buying behaviour differs between the two days, and vice versa. We thereby obtain a distance matrix characterizing the dataset shift. After an agglomerative clustering of the distance matrix, we observe that the dataset shift pattern matches the calendar events for this time period (holidays, weekends, etc.). We then incorporate this dataset shift knowledge into the credit card fraud detection task as a new feature. This leads to a small improvement in detection.
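
The pipeline described in the abstract (classify days against each other, turn classification quality into a distance, then cluster the distance matrix) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data is synthetic, the features (amount, hour) are hypothetical, a nearest-centroid rule stands in for whatever classifier the authors used, the score `2 * accuracy - 1` is one possible way to map classification efficiency to a distance, and average linkage is an arbitrary choice.

```python
import random
from statistics import mean


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def day_distance(day_a, day_b):
    """Distance in [0, 1]: how well a simple classifier tells two days apart.

    Even-indexed rows fit a nearest-centroid classifier (a stand-in, not the
    paper's model); odd-indexed rows evaluate it. Chance accuracy (0.5) maps
    to 0 and perfect accuracy maps to 1 (assumed scoring rule).
    """
    c_a, c_b = centroid(day_a[0::2]), centroid(day_b[0::2])
    test = [(r, 0) for r in day_a[1::2]] + [(r, 1) for r in day_b[1::2]]
    correct = sum((sq_dist(r, c_b) < sq_dist(r, c_a)) == bool(y) for r, y in test)
    return max(0.0, 2.0 * correct / len(test) - 1.0)


def distance_matrix(days):
    """Symmetric matrix of pairwise day distances."""
    n = len(days)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = day_distance(days[i], days[j])
    return dist


def agglomerative(dist, n_clusters):
    """Average-linkage agglomerative clustering on a precomputed matrix."""
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return mean(dist[i][j] for i in a for j in b)

    while len(clusters) > n_clusters:
        pairs = ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters)))
        a, b = min(pairs, key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[a] += clusters.pop(b)  # merge the two closest clusters
    return [sorted(c) for c in clusters]


def make_day(rng, mean_amount, n=200):
    """Synthetic transactions with hypothetical features (amount, hour)."""
    return [[rng.gauss(mean_amount, 5.0), rng.gauss(14.0, 3.0)] for _ in range(n)]


rng = random.Random(0)
# Days 0-4 imitate weekdays; days 5-6 imitate a weekend with larger amounts.
days = [make_day(rng, 30.0) for _ in range(5)] + [make_day(rng, 60.0) for _ in range(2)]
dist = distance_matrix(days)
clusters = agglomerative(dist, n_clusters=2)
print(clusters)
```

On this toy data the two synthetic regimes separate cleanly, which mirrors the abstract's observation that clusters align with calendar events. The cluster label of each day could then be appended to every transaction as an extra feature, in the spirit of the last step of the abstract.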
Document type : Conference papers

Contributor : Pierre-Edouard Portier
Submitted on : Tuesday, July 9, 2019 - 3:06:15 PM
Last modification on : Tuesday, June 1, 2021 - 2:08:07 PM

  • HAL Id : hal-02178042, version 1
  • ARXIV : 1906.06977


Yvan Lucas, Pierre-Edouard Portier, Léa Laporte, Sylvie Calabretto, Liyun He-Guelton, et al. Dataset shift quantification for credit card fraud detection. AIKE IEEE International Conference on Artificial Intelligence and Knowledge Engineering, Jun 2019, Cagliari, Italy. ⟨hal-02178042⟩


