Journal articles

Learning Aspect Models with Partially Labeled Data

Abstract : In this paper, we address the problem of learning aspect models with partially labeled data for the task of document categorization. The motivation of this work is to take advantage of the amount of available unlabeled data together with the set of labeled examples to learn latent models whose structure and underlying hypotheses take the document generation process into account more accurately than other mixture-based generative models. We present one semi-supervised variant of the PLSA model. In our approach, we try to capture the possible data mislabeling errors which occur during the training of our model. This is done by iteratively assigning class labels to [...] document collections, as well as over a real-world dataset coming from a Business Group of Xerox, and show the effectiveness of our approach compared to a semi-supervised version of Naive Bayes, another semi-supervised version of PLSA, and transductive Support Vector Machines.
https://hal.archives-ouvertes.fr/hal-01172498
Contributor : Lip6 Publications
Submitted on : Tuesday, July 7, 2015 - 2:47:14 PM
Last modification on : Thursday, March 21, 2019 - 1:10:58 PM

Citation

Anastasia Krithara, Massih-Reza Amini, Cyril Goutte, Jean-Michel Renders. Learning Aspect Models with Partially Labeled Data. Pattern Recognition Letters, Elsevier, 2011, 32 (2), pp.297-304. ⟨10.1016/j.patrec.2010.09.004⟩. ⟨hal-01172498⟩
