Correcting a Class of Complete Selection Bias with External Data Based on Importance Weight Estimation

Van-Tinh Tran 1 Alex Aussem 1
1 DM2L - Data Mining and Machine Learning
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract: We present a practical bias correction method for classification and regression models learned under a general class of selection bias. The method hinges on two assumptions: 1) a feature vector Xs exists such that S, the variable that controls the inclusion of the samples in the training set, is conditionally independent of (X, Y) given Xs; 2) one has access to some external samples drawn from the population as a whole, from which the unbiased distribution of Xs can be approximated. This general framework includes covariate shift and prior probability shift as special cases. We first show how importance weighting can remove this bias. We also discuss the case where our key assumption about Xs is not valid and where Xs is only partially observed in the test set. Experimental results on synthetic and real-world data demonstrate that our method works well in practice.
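To illustrate the core idea described in the abstract (this is a minimal sketch, not the paper's implementation), the snippet below reweights a biased training sample by the importance weight w(xs) = P(xs) / P(xs | S=1). It assumes a hypothetical discrete binary Xs so the two distributions can be estimated by simple counting; the external (unbiased) sample is used for the numerator, the biased training sample for the denominator:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: Xs is a single binary feature that controls selection.
# In the population, P(Xs=1) = 0.5; the biased training set over-samples Xs=1.
external_xs = rng.integers(0, 2, size=10_000)             # unbiased draws of Xs
train_xs = rng.choice([0, 1], size=10_000, p=[0.2, 0.8])  # biased draws (S=1)

# Estimate the importance weight w(xs) = P(xs) / P(xs | S=1) by counting.
p_ext = np.bincount(external_xs, minlength=2) / len(external_xs)
p_train = np.bincount(train_xs, minlength=2) / len(train_xs)
w = p_ext / p_train          # per-value weights, roughly [2.5, 0.625] here

# Attaching w(xs) to each training sample debiases weighted statistics:
# the reweighted frequency of Xs=1 recovers the population value of ~0.5.
weights = w[train_xs]
reweighted_p1 = np.sum(weights * (train_xs == 1)) / np.sum(weights)
```

In practice the same per-sample `weights` would be passed to a learner that supports instance weighting (e.g. a `sample_weight` argument) so that the fitted model targets the population distribution rather than the biased training distribution; with a continuous Xs, the density ratio would have to be estimated rather than counted.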
Document type: Conference papers

Cited literature: 9 references

https://hal.archives-ouvertes.fr/hal-01247394
Contributor: Van-Tinh Tran
Submitted on: Tuesday, December 22, 2015 - 3:22:37 PM
Last modification on: Thursday, November 21, 2019 - 2:29:44 AM
Long-term archiving on: Wednesday, March 23, 2016 - 2:04:19 PM

File

Selection_bias_ICONIP2015.pdf
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 International License

Citation

Van-Tinh Tran, Alex Aussem. Correcting a Class of Complete Selection Bias with External Data Based on Importance Weight Estimation. 22nd International Conference on Neural Information Processing (ICONIP 2015), Nov 2015, Istanbul, Turkey. ⟨10.1007/978-3-319-26555-1_13⟩. ⟨hal-01247394⟩

Metrics
Record views: 781
File downloads: 340