
Some Statistical Aspects of Credit Scoring

Gilbert Saporta 1
1 CEDRIC - MSDMA - CEDRIC. Méthodes statistiques de data-mining et apprentissage
CEDRIC - Centre d'études et de recherche en informatique et communications
Abstract : Basel 2 regulations have renewed interest in supervised classification methods for predicting the default probability of loans. Default probabilities may be computed directly or by means of a score function. An important feature of consumer credit is that predictors are generally categorical. Logistic regression and linear discriminant analysis are the most frequently used techniques because they provide easy-to-use scorecards based on additive partial scores. Vapnik's statistical learning theory explains why a prior dimension reduction (e.g. by means of multiple correspondence analysis) improves the robustness of the score function. Ridge regression, linear SVM and PLS regression are also valuable competitors. Density estimation, neural networks and nonlinear SVM provide direct estimates of default probability but are less widely used because they lack interpretability. Since a probability is also a score, almost all classification methods (including classification trees) may be compared with ROC analysis, which is more informative than the simple misclassification rate. AUC and Gini's index are related to the well-known nonparametric Wilcoxon-Mann-Whitney test. Some experiments on real data will be presented. Distinguishing between good and bad customers is not enough, especially for long-term loans. The question is then not only "if" but "when" customers default. Survival analysis provides new types of scores, but their performance is far more difficult to measure.
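The link between AUC, Gini's index and the Wilcoxon-Mann-Whitney test mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names and the toy scores are hypothetical, and it assumes that a higher score denotes a better customer. The Mann-Whitney U statistic counts the (good, bad) pairs in which the good customer has the higher score, ties counting one half; dividing U by the number of pairs gives the AUC, and Gini's index is 2·AUC − 1.

```python
def mann_whitney_u(scores_good, scores_bad):
    """Count (good, bad) pairs where the good customer scores higher; ties count 1/2."""
    u = 0.0
    for g in scores_good:
        for b in scores_bad:
            if g > b:
                u += 1.0
            elif g == b:
                u += 0.5
    return u


def auc_from_scores(scores_good, scores_bad):
    """AUC = U / (n_good * n_bad): probability that a random good outscores a random bad."""
    return mann_whitney_u(scores_good, scores_bad) / (len(scores_good) * len(scores_bad))


def gini_from_auc(auc):
    """Gini's index as a linear rescaling of the AUC."""
    return 2.0 * auc - 1.0


# Hypothetical toy scores, for illustration only (higher = better customer).
good = [0.9, 0.8, 0.6, 0.55]
bad = [0.7, 0.4, 0.3]

auc = auc_from_scores(good, bad)
print("U    =", mann_whitney_u(good, bad))
print("AUC  =", round(auc, 4))
print("Gini =", round(gini_from_auc(auc), 4))
```

The pairwise loop is quadratic in the sample sizes; for large portfolios a rank-based computation of U would be the usual choice.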
Document type :
Conference papers
Contributor : Laboratoire Cedric
Submitted on : Friday, December 11, 2020 - 12:36:04 PM
Last modification on : Wednesday, March 24, 2021 - 11:51:34 AM



  • HAL Id : hal-01125140, version 1



Gilbert Saporta. Some Statistical Aspects of Credit Scoring. 3rd World Conf. on Computational Statistics & Data Analysis, Oct 2005, Limassol, Cyprus. ⟨hal-01125140⟩


