Journal article. Journal of Machine Learning Research. Year: 2007

Generalization error bounds in semi-supervised classification under the cluster assumption

Abstract

We consider semi-supervised classification when part of the available data is unlabeled. These unlabeled data can be useful for the classification problem when we make an assumption relating the behavior of the regression function to that of the marginal distribution. Seeger (2000) proposed the well-known "cluster assumption" as a reasonable one. We propose a mathematical formulation of this assumption and a method based on density level set estimation that takes advantage of it to achieve fast rates of convergence both in the number of unlabeled examples and in the number of labeled examples.
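
The idea in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the estimator analyzed in the paper: the Gaussian kernel, the bandwidth `h`, the density `level`, the `gap` rule for splitting clusters, and all function names are assumptions chosen for illustration. Unlabeled points estimate where the marginal density is high, the high-density region is split into clusters, and the few labeled points decide each cluster's label by majority vote.

```python
import math
from collections import Counter

def kde(x, sample, h):
    """Gaussian kernel density estimate at x from a 1-D sample (bandwidth h)."""
    norm = len(sample) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample) / norm

def level_set_clusters(unlabeled, h, level, gap):
    """Keep points whose estimated density is >= level, then split the sorted
    survivors into clusters wherever two neighbours are more than gap apart.
    Each cluster is returned as an interval (lo, hi)."""
    kept = sorted(x for x in unlabeled if kde(x, unlabeled, h) >= level)
    clusters = []
    for x in kept:
        if clusters and x - clusters[-1][-1] <= gap:
            clusters[-1].append(x)
        else:
            clusters.append([x])
    return [(c[0], c[-1]) for c in clusters]

def fit_cluster_votes(intervals, labeled):
    """Majority vote of the labeled points that fall inside each interval."""
    votes = []
    for lo, hi in intervals:
        labels = [y for x, y in labeled if lo <= x <= hi]
        votes.append(Counter(labels).most_common(1)[0][0] if labels else None)
    return votes

def predict(x, intervals, votes):
    """Label of the cluster containing x, or None outside every cluster."""
    for (lo, hi), vote in zip(intervals, votes):
        if lo <= x <= hi:
            return vote
    return None

# Toy data (assumed): two dense groups plus one isolated low-density point.
unlabeled = [i * 0.1 for i in range(11)] + [5 + i * 0.1 for i in range(11)] + [2.5]
labeled = [(0.2, 0), (0.7, 0), (5.4, 1)]  # only three labeled examples

intervals = level_set_clusters(unlabeled, h=0.3, level=0.15, gap=0.5)
votes = fit_cluster_votes(intervals, labeled)

print(len(intervals))                   # 2 estimated clusters
print(predict(0.9, intervals, votes))   # 0
print(predict(5.8, intervals, votes))   # 1
print(predict(2.5, intervals, votes))   # None (below the density level)
```

In this sketch the unlabeled sample only shapes the clusters and the labeled sample only votes, which is one way to see why error rates can be stated separately in the two sample sizes.
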
Main file
rigollet_JMLR07_rev.pdf (264.61 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-00022528, version 1 (2006-04-10)
hal-00022528, version 2 (2006-09-22)
hal-00022528, version 3 (2007-01-24)
hal-00022528, version 4 (2007-07-05)

Cite

Philippe Rigollet. Generalization error bounds in semi-supervised classification under the cluster assumption. Journal of Machine Learning Research, 2007, 8 (Jul), pp. 1369-1392. ⟨hal-00022528v4⟩