Partial data querying through racing algorithms

Abstract : The paper studies the problem of actively learning from instances characterized by imprecise features or imprecise class labels, where active learning means being able to query the precise value of imprecisely specified data. This differs from classical active learning, in which data are either fully precise or completely missing, whereas in our case they can be partially specified. Such situations arise when sensor errors are important to encode, or when experts have specified only a subset of possible labels when tagging data. We provide a general active learning technique that can in principle be applied to any model. It is inspired by racing algorithms, in which several models compete against each other. The main idea of our method is to identify the query that will be most helpful in identifying the winning model in the competition. After discussing and formalizing the general ideas of our approach, we illustrate it by studying the particular case of binary SVMs with interval-valued features and set-valued labels. The experimental results indicate that, in comparison to other baselines, racing algorithms provide a faster reduction of the uncertainty in the learning process, especially in the case of imprecise features.
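
To make the racing idea in the abstract concrete, the following is a minimal sketch for set-valued labels only. It is not the authors' algorithm: the threshold classifiers, the interval test used to discard models, and the "minority vote" score used to pick a query are all simplifying assumptions chosen for illustration, and every function name is hypothetical.

```python
from collections import Counter
from typing import Callable, FrozenSet, List, Optional, Sequence, Tuple

Model = Callable[[float], int]
Instance = Tuple[float, FrozenSet[int]]  # (feature value, set of possible labels)


def threshold_model(t: float) -> Model:
    """Toy binary classifier (assumption): predict 1 when x exceeds t."""
    return lambda x: 1 if x > t else 0


def loss_bounds(model: Model, data: Sequence[Instance]) -> Tuple[int, int]:
    """Lower and upper bounds on the number of errors made by `model`."""
    lower = upper = 0
    for x, labels in data:
        pred = model(x)
        if pred not in labels:
            lower += 1  # wrong whatever the true label turns out to be
            upper += 1
        elif len(labels) > 1:
            upper += 1  # possibly wrong, depending on the true label
    return lower, upper


def surviving_models(models: Sequence[Model], data: Sequence[Instance]) -> List[int]:
    """Indices of models that are not certainly beaten by another model.

    Conservative interval test (assumption): a model is discarded only if
    its lower bound exceeds the smallest upper bound among all models.
    """
    bounds = [loss_bounds(m, data) for m in models]
    best_upper = min(u for _, u in bounds)
    return [i for i, (lo, _) in enumerate(bounds) if lo <= best_upper]


def best_query(models: Sequence[Model], data: Sequence[Instance]) -> Optional[int]:
    """Index of the imprecise instance on which survivors disagree the most."""
    alive = surviving_models(models, data)
    best_idx, best_score = None, 0
    for idx, (x, labels) in enumerate(data):
        if len(labels) < 2:
            continue  # already precise, nothing to query
        votes = Counter(models[i](x) for i in alive)
        score = len(alive) - max(votes.values())  # size of the minority
        if score > best_score:
            best_idx, best_score = idx, score
    return best_idx


models = [threshold_model(t) for t in (0.5, 1.5, 2.5)]
data = [
    (0.0, frozenset({0})),
    (1.0, frozenset({0, 1})),  # label not yet known
    (2.0, frozenset({0, 1})),  # label not yet known
    (3.0, frozenset({1})),
]
query_index = best_query(models, data)
```

On this toy data all three models survive the interval test, and `best_query` points at one of the two partially labelled instances. Once both labels are revealed to be 1, only the first model remains in the race.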
Document type : Journal article

Contributor : Sébastien Destercke
Submitted on : Monday, April 30, 2018 - 12:57:17 PM
Last modification on : Wednesday, April 10, 2019 - 9:07:46 AM
Long-term archiving on : Tuesday, September 25, 2018 - 7:20:31 AM


Files produced by the author(s)




Vu-Linh Nguyen, Sébastien Destercke, Marie-Hélène Masson. Partial data querying through racing algorithms. International Journal of Approximate Reasoning, Elsevier, 2018, 96, pp.36-55. ⟨10.1016/j.ijar.2018.03.005⟩. ⟨hal-01781455⟩


