Conference papers

Credit scoring, statistique et apprentissage

Gilbert Saporta 1
1 CEDRIC - MSDMA - CEDRIC. Méthodes statistiques de data-mining et apprentissage
CEDRIC - Centre d'études et de recherche en informatique et communications
Abstract : The Basel II regulations brought renewed interest in supervised classification methods for predicting the probability of default on loans. An important feature of consumer credit is that the predictors are generally categorical. Logistic regression and linear discriminant analysis are the most frequently used techniques, and they are often needlessly set against each other. Vapnik's statistical learning theory explains why a prior dimension reduction (e.g. by means of multiple correspondence analysis) improves the robustness of the score function. Ridge regression, linear SVM and PLS regression are also valuable competitors. Predictive capability is measured by the AUC or by Gini's index, both of which are related to the well-known non-parametric Wilcoxon-Mann-Whitney test. Among the methodological problems, reject inference (taking rejected applications into account) is an important one, since most samples are subject to a selection bias. It has given rise to an abundant literature, but with hardly any convincing results. Distinguishing between good and bad customers is no longer the only objective, especially for long-term loans: the question is not only "if" but also "when" a customer defaults. Survival analysis provides new types of scores, and much current work is turning to survival models for censored data, of which we give an overview.
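The link between the AUC, Gini's index and the Wilcoxon-Mann-Whitney statistic mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a higher score means a better customer, and it uses the standard identities AUC = U / (n_good × n_bad) and Gini = 2 × AUC − 1. The function names and the toy scores are hypothetical and are not taken from the paper.

```python
def auc_mann_whitney(scores_good, scores_bad):
    """AUC computed as the normalized Mann-Whitney U statistic.

    Counts the (good, bad) pairs in which the good customer has the
    higher score; tied pairs count for one half.
    Assumption: a higher score means a lower risk of default.
    """
    u = 0.0
    for g in scores_good:
        for b in scores_bad:
            if g > b:
                u += 1.0
            elif g == b:
                u += 0.5
    return u / (len(scores_good) * len(scores_bad))


def gini_index(auc):
    """Gini's index as a linear rescaling of the AUC."""
    return 2.0 * auc - 1.0


# Hypothetical toy scores, for illustration only.
good = [0.9, 0.8, 0.7, 0.4]
bad = [0.6, 0.3, 0.2]

auc = auc_mann_whitney(good, bad)
gini = gini_index(auc)
print(f"AUC = {auc:.4f}, Gini = {gini:.4f}")
```

The double loop is quadratic in the sample size and is only meant to make the pairwise definition explicit; a rank-based computation would be preferable on real portfolios.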

https://hal.archives-ouvertes.fr/hal-01125151
Contributor : Laboratoire Cedric
Submitted on : Friday, March 6, 2015 - 11:00:46 AM
Last modification on : Wednesday, March 25, 2020 - 7:34:21 PM

Identifiers

  • HAL Id : hal-01125151, version 1

Citation

Gilbert Saporta. Credit scoring, statistique et apprentissage. EGC'06, Jan 2006, Lille, France. ⟨hal-01125151⟩
