Decontamination of Mutually Contaminated Models

Gilles Blanchard; Clayton Scott

Conference paper, Year: 2014


Gilles Blanchard

Role: Author
PersonId: 738034
IdHAL: gilles-blanchard
ORCID: 0000-0003-2125-933X
IdRef: 190973250

Institut für Mathematik [Potsdam]

Clayton Scott

Role: Author

University of Michigan [Ann Arbor]

Abstract

A variety of machine learning problems are characterized by data sets that are drawn from multiple different convex combinations of a fixed set of base distributions. We call this a mutual contamination model. In such problems, it is often of interest to recover these base distributions, or otherwise discern their properties. This work focuses on the problem of classification with multiclass label noise, in a general setting where the noise proportions are unknown and the true class distributions are nonseparable and potentially quite complex. We develop a procedure for decontamination of the contaminated models from data, which then facilitates the design of a consistent discrimination rule. Our approach relies on a novel method for estimating the error when projecting one distribution onto a convex combination of others, where the projection is with respect to a statistical distance known as the separation distance. Under sufficient conditions on the amount of noise and purity of the base distributions, this projection procedure successfully recovers the underlying class distributions. Connections to novelty detection, topic modeling, and other learning problems are also discussed.
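As a rough illustration of the mutual contamination model described in the abstract, the sketch below draws each observed data set from its own convex combination of a fixed set of base distributions. It is not the authors' decontamination procedure. The three Gaussian base distributions, the mixing weights, and the names `sample_contaminated` and `empirical_proportions` are all assumptions chosen for the example. The latent component index is stored only so the mixing weights can be checked; in the setting of the paper it would be unobserved.

```python
import random


def sample_contaminated(mixing_row, base_samplers, n, rng):
    """Draw n points from the mixture sum_j mixing_row[j] * P_j.

    Returns (value, latent_index) pairs. The latent index is kept only
    for checking; a learner would see the values alone.
    """
    indices = range(len(base_samplers))
    samples = []
    for _ in range(n):
        # Pick a base distribution according to the mixing weights.
        j = rng.choices(indices, weights=mixing_row, k=1)[0]
        samples.append((base_samplers[j](rng), j))
    return samples


def empirical_proportions(samples, k):
    """Fraction of samples that came from each of the k base distributions."""
    counts = [0] * k
    for _, j in samples:
        counts[j] += 1
    return [c / len(samples) for c in counts]


# Hypothetical base (true class) distributions: three 1-D Gaussians.
base_samplers = [
    lambda rng: rng.gauss(-2.0, 1.0),
    lambda rng: rng.gauss(0.0, 1.0),
    lambda rng: rng.gauss(2.0, 1.0),
]

# Hypothetical mixing weights: row i gives the convex combination that
# generates contaminated data set i (each row sums to 1).
mixing = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
]

rng = random.Random(0)
contaminated = [
    sample_contaminated(row, base_samplers, 2000, rng) for row in mixing
]

for row, samples in zip(mixing, contaminated):
    print(row, [round(p, 2) for p in empirical_proportions(samples, 3)])
```

Each row of `mixing` plays the role of the unknown noise proportions for one observed class; recovering the base distributions from `contaminated` alone, without the latent indices, is the problem the paper addresses.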

Subject areas

Artificial Intelligence [cs.AI]; Machine Learning [cs.LG]; Statistics [math.ST]

Main file

blanchard14-supp-pdfjam.pdf (523.51 KB)

Origin: Publisher files allowed on an open archive

https://hal.science/hal-03371264

Submitted on: Friday, October 8, 2021, 14:41:55

Last modified on: Friday, April 19, 2024, 15:27:26

Long-term archiving on: Sunday, January 9, 2022, 19:51:20

Dates and versions

hal-03371264, version 1 (8 October 2021)

License

Attribution

Identifiers

HAL Id: hal-03371264, version 1

Cite

Gilles Blanchard, Clayton Scott. Decontamination of Mutually Contaminated Models. Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS 2014), 2014, Reykjavik, Iceland. pp.1-9. ⟨hal-03371264⟩
