Learning from aggregated data with a maximum entropy model

Alexandre Gilotte; Ahmed Ben Yamed; David Rohde

Preprint, Working Paper. Year: 2022


Alexandre Gilotte

Role: Author

Criteo AI Lab

Ahmed Ben Yamed

Role: Author
PersonId : 1161754

Criteo AI Lab

David Rohde

Role: Author
PersonId : 1053680

Criteo AI Lab

Abstract

Aggregating a dataset, then injecting some noise, is a simple and common way to release differentially private data. However, aggregated data, even without noise, is not an appropriate input for machine learning classifiers. In this work, we show how a new model, similar to a logistic regression, may be learned from aggregated data only, by approximating the unobserved feature distribution with a maximum entropy hypothesis. The resulting model is a Markov Random Field (MRF), and we detail how to apply, modify and scale an MRF training algorithm to our setting. Finally, we present empirical evidence on several public datasets that the model learned this way can achieve performance comparable to that of a logistic model trained with the full unaggregated data.
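To make the "aggregate, then inject noise" step concrete, here is a minimal sketch that is not taken from the paper. It assumes integer-coded categorical features, releases only single-feature count tables, and uses the Laplace mechanism. The function name `aggregate_with_laplace_noise` and its parameters are hypothetical, and the authors' actual aggregation scheme and noise distribution may differ.

```python
import numpy as np

def aggregate_with_laplace_noise(X, y, n_modalities, epsilon, rng):
    """Return one noisy table per feature (illustrative sketch, not the paper's code).

    X: (n_examples, n_features) array of non-negative integer category codes.
    y: (n_examples,) array of binary labels in {0, 1}.
    n_modalities: number of distinct values of each feature.
    Each table has shape (n_modalities[j], 2): [example count, positive count].
    """
    n_features = X.shape[1]
    # Adding or removing one example changes at most two cells per feature
    # table by 1 each (the example count and, if y = 1, the positive count),
    # so the L1 sensitivity of the whole release is 2 * n_features.
    sensitivity = 2.0 * n_features
    scale = sensitivity / epsilon  # Laplace mechanism noise scale
    tables = []
    for j in range(n_features):
        counts = np.bincount(X[:, j], minlength=n_modalities[j]).astype(float)
        positives = np.bincount(X[:, j], weights=y, minlength=n_modalities[j])
        exact = np.stack([counts, positives], axis=1)
        noise = rng.laplace(0.0, scale, size=exact.shape)
        tables.append(exact + noise)
    return tables

# Toy usage: 4 examples, 2 categorical features.
rng = np.random.default_rng(0)
X = np.array([[0, 1], [1, 1], [0, 2], [0, 0]])
y = np.array([1, 0, 1, 0])
noisy_tables = aggregate_with_laplace_noise(X, y, [2, 3], epsilon=1.0, rng=rng)
```

The returned tables no longer contain any individual row, which is why a standard classifier cannot be fit to them directly and a dedicated model is needed.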

Domains

Artificial Intelligence [cs.AI]

Main file

aggregates.pdf (594.24 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-03770740

Submitted on: Tuesday, October 4, 2022, 13:08:02

Last modified on: Wednesday, October 5, 2022, 16:36:37

Dates and versions

hal-03770740 , version 1 (06-09-2022)

hal-03770740 , version 2 (04-10-2022)

Identifiers

HAL Id: hal-03770740, version 2
ARXIV: 3796293

Cite

Alexandre Gilotte, Ahmed Ben Yamed, David Rohde. Learning from aggregated data with a maximum entropy model. 2022. ⟨hal-03770740v2⟩
