Learning Classification with Both Labeled and Unlabeled Data

Abstract : A key difficulty in applying machine learning classification algorithms to many applications is that they require a large number of hand-labeled examples. Labeling large amounts of data is a costly process that is in many cases prohibitive. In this paper we show how a small number of labeled examples, used together with a large number of unlabeled examples, can produce high-accuracy classifiers. Our approach does not rely on any parametric assumptions about the data, as is usually the case with the generative methods widely used in semi-supervised learning. We propose new discriminant algorithms that handle both labeled and unlabeled data when training classification models, and we analyze their performance on different information access problems, ranging from text span classification for text summarization to e-mail spam detection and text classification.
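
To make the idea of training a discriminant classifier on both labeled and unlabeled data concrete, here is a minimal sketch of a generic self-training loop around logistic regression. It is not the algorithm proposed in the paper, which the abstract does not spell out. The function names, the toy two-cluster data, and all hyperparameters (epochs, learning rate, number of rounds) are assumptions chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(xs, ys, epochs=200, lr=0.5):
    """Fit a logistic-regression classifier by batch gradient descent."""
    dim = len(xs[0])
    w = [0.0] * dim
    b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    """Return the estimated probability that x belongs to class 1."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def self_train(labeled_x, labeled_y, unlabeled_x, rounds=5):
    """Generic self-training (an assumption, not the paper's method):
    pseudo-label the unlabeled points with the current model, then
    retrain on the labeled and pseudo-labeled points together."""
    model = train_logistic(labeled_x, labeled_y)
    for _ in range(rounds):
        pseudo_y = [1 if predict_proba(model, x) >= 0.5 else 0
                    for x in unlabeled_x]
        model = train_logistic(labeled_x + unlabeled_x, labeled_y + pseudo_y)
    return model


def make_toy_data(n_per_class, seed=0):
    """Hypothetical toy data: two Gaussian clusters in the plane."""
    rng = random.Random(seed)
    xs, ys = [], []
    for label, center in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            xs.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            ys.append(label)
    return xs, ys


# Keep labels for only 5 points per class; treat the rest as unlabeled.
all_x, all_y = make_toy_data(100, seed=0)
labeled_idx = list(range(5)) + list(range(100, 105))
labeled_x = [all_x[i] for i in labeled_idx]
labeled_y = [all_y[i] for i in labeled_idx]
unlabeled_x = [x for i, x in enumerate(all_x) if i not in labeled_idx]

model = self_train(labeled_x, labeled_y, unlabeled_x)
predictions = [1 if predict_proba(model, x) >= 0.5 else 0 for x in all_x]
accuracy = sum(p == y for p, y in zip(predictions, all_y)) / len(all_y)
print("accuracy on the toy pool:", accuracy)
```

The sketch uses a discriminative model throughout, so it makes no assumption about how the inputs are distributed within each class. Hard pseudo-labels are the simplest choice here; a variant could weight pseudo-labeled points by the classifier's confidence.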

https://hal.archives-ouvertes.fr/hal-02182470
Contributor : Jean-Noël Vittaut
Submitted on : Friday, July 12, 2019 - 5:29:09 PM
Last modification on : Tuesday, July 16, 2019 - 10:59:20 AM

Citation

Jean-Noël Vittaut, Massih-Reza Amini, Patrick Gallinari. Learning Classification with Both Labeled and Unlabeled Data. Machine Learning: ECML 2002, 13th European Conference on Machine Learning, Aug 2002, Helsinki, Finland. pp.468-479, ⟨10.1007/3-540-36755-1_39⟩. ⟨hal-02182470⟩
