Conference papers

Statistical Methods and Credit Scoring

Gilbert Saporta 1
1 CEDRIC - MSDMA - CEDRIC. Méthodes statistiques de data-mining et apprentissage
CEDRIC - Centre d'études et de recherche en informatique et communications
Abstract : Basel II regulations brought new interest in supervised classification methods for predicting the default probability of loans. Density estimation, neural networks, and nonlinear SVMs provide direct estimates of default probability but are not widely used because they lack interpretability. Logistic regression and linear discriminant analysis are the most frequently used techniques because they provide easy-to-use scorecards based on additive partial scores. We will compare these two major techniques, which are sometimes unduly opposed. Since posterior probabilities depend on priors, we will address the case of stratified sampling. An important feature of consumer credit is that predictors are generally categorical. Vapnik's statistical learning theory explains why a prior dimension reduction (e.g., by means of multiple correspondence analysis) improves the robustness of the score function. Default probabilities may be computed directly or by means of a score function. Since a probability is also a score, almost all classification methods (including classification trees) may be compared by means of ROC analysis, which is more informative than the simple misclassification rate. Survival analysis brings new perspectives, especially for long-term loans, by predicting "when" rather than "if" a default occurs.
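Two points of the abstract can be illustrated with a minimal sketch, which is not taken from the paper itself: the dependence of posterior probabilities on priors under stratified sampling, and the comparison of scores by ROC analysis. The sketch assumes a model fitted on a sample in which defaults were over-represented, and applies the standard log-odds shift to recover population-level probabilities. The function names, the sampling rates (50% defaults in the sample, 5% in the population), and the toy data are all hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def correct_for_priors(p_sample, sample_rate, population_rate):
    """Shift a posterior estimated on a stratified sample to the population prior.

    Only the log-odds intercept changes:
    logit(p_pop) = logit(p_sample) + logit(population_rate) - logit(sample_rate)
    """
    shifted = logit(p_sample) + logit(population_rate) - logit(sample_rate)
    return 1.0 / (1.0 + math.exp(-shifted))


def auc(scores, labels):
    """Area under the ROC curve, computed as the share of (default, non-default)
    pairs in which the default has the higher score; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical posteriors from a model trained on a balanced (50/50) sample,
# with label 1 = default and 0 = no default.
sample_probs = [0.9, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]

# Assumed true default rate in the population: 5%.
population_probs = [correct_for_priors(p, 0.5, 0.05) for p in sample_probs]

print(population_probs)
print(auc(sample_probs, labels), auc(population_probs, labels))
```

Because the correction is a monotone shift of the log-odds, it changes the probabilities but not the ranking of applicants, so the AUC is the same before and after the correction in this example.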
Document type : Conference papers
Contributor : Laboratoire Cedric
Submitted on : Sunday, December 13, 2020 - 12:16:45 PM
Last modification on : Tuesday, December 15, 2020 - 11:18:10 AM


  • HAL Id : hal-01125170, version 1



Gilbert Saporta. Statistical Methods and Credit Scoring. JOCLAD'06, Apr 2006, Lisbon, Portugal. ⟨hal-01125170⟩


