A Practical Approach to Reduce the Learning Bias Under Covariate Shift

Van-Tinh Tran ¹, Alex Aussem ¹
¹ DM2L - Data Mining and Machine Learning, LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract: Covariate shift is a specific class of selection bias that arises when the marginal distributions of the input features X differ between the source and target domains while the conditional distribution of the target Y given X is the same in both. A common technique for dealing with this problem, called importance weighting, reweights the training instances so that they resemble the test distribution. However, this usually comes at the expense of a reduced effective sample size. In this paper, we show analytically that, while the unweighted model is globally more biased than the weighted one, it may be locally less biased on low-importance instances. In view of this result, we then discuss a way to optimally combine the weighted and unweighted models in order to improve predictive performance in the target domain. We conduct a series of experiments on synthetic and real-world data to demonstrate the efficiency of this approach.
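
As an illustration of the importance-weighting idea described in the abstract, here is a minimal Python sketch. It is not the authors' code and does not reproduce their scheme for combining the weighted and unweighted models. It assumes that the source and target input densities are known one-dimensional Gaussians (in practice the density ratio has to be estimated), that the model is a straight line fitted by weighted least squares, and that the effective sample size is measured with the common Kish formula (sum of weights squared, divided by the sum of squared weights). All function names and numeric settings are hypothetical.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def importance_weights(x, src_mean, src_std, tgt_mean, tgt_std):
    """w(x) = p_target(x) / p_source(x), assuming both densities are known."""
    return gaussian_pdf(x, tgt_mean, tgt_std) / gaussian_pdf(x, src_mean, src_std)


def effective_sample_size(w):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    w = np.asarray(w, dtype=float)
    return w.sum() ** 2 / np.square(w).sum()


def weighted_least_squares(x, y, w):
    """Fit y ~ a + b*x by minimising sum_i w_i * (y_i - a - b*x_i)^2."""
    design = np.column_stack([np.ones_like(x), x])
    sqrt_w = np.sqrt(w)
    # Scaling rows by sqrt(w) turns the weighted problem into an ordinary one.
    coef, *_ = np.linalg.lstsq(design * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return coef


# Assumed toy setting: source inputs ~ N(0, 1), target inputs ~ N(1, 0.5).
rng = np.random.default_rng(0)
n = 500
x_src = rng.normal(0.0, 1.0, size=n)
# The true regression function is nonlinear, so a linear fit is misspecified
# and the weighted and unweighted fits generally differ.
y_src = np.sin(x_src) + rng.normal(0.0, 0.1, size=n)

w = importance_weights(x_src, 0.0, 1.0, 1.0, 0.5)
ess = effective_sample_size(w)

coef_unweighted = weighted_least_squares(x_src, y_src, np.ones(n))
coef_weighted = weighted_least_squares(x_src, y_src, w)

print(f"training size: {n}, effective sample size: {ess:.1f}")
print("unweighted [intercept, slope]:", coef_unweighted)
print("weighted   [intercept, slope]:", coef_weighted)
```

With non-uniform weights the effective sample size is strictly smaller than the number of training points, which is the cost of reweighting that the abstract refers to.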
Document type: Conference paper

https://hal.archives-ouvertes.fr/hal-01213965
Contributor : Van-Tinh Tran
Submitted on : Tuesday, October 13, 2015 - 2:38:53 PM
Last modification on : Wednesday, November 20, 2019 - 2:55:46 AM
Long-term archiving on: Thursday, April 27, 2017 - 12:09:10 AM

File

Covariate_shift_ECML2015.pdf
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution - NonCommercial 4.0 International License

Citation

Van-Tinh Tran, Alex Aussem. A Practical Approach to Reduce the Learning Bias Under Covariate Shift. ECML PKDD 2015 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Sep 2015, Porto, Portugal. pp. 71-86, ⟨10.1007/978-3-319-23525-7_5⟩. ⟨hal-01213965⟩
