Learning from aggregated data with a maximum entropy model - HAL open archive
Preprint, Working Paper. Year: 2022

Learning from aggregated data with a maximum entropy model

Alexandre Gilotte
  • Role: Author
Ahmed Ben Yamed
  • Role: Author
  • PersonId : 1161754
David Rohde
  • Role: Author
  • PersonId : 1053680

Abstract

Aggregating a dataset, then injecting some noise, is a simple and common way to release differentially private data. However, aggregated data, even without noise, is not an appropriate input for machine learning classifiers. In this work, we show how a new model, similar to a logistic regression, may be learned from aggregated data only, by approximating the unobserved feature distribution with a maximum entropy hypothesis. The resulting model is a Markov Random Field (MRF), and we detail how to apply, modify and scale an MRF training algorithm to our setting. Finally, we present empirical evidence on several public datasets that the model learned this way can achieve performance comparable to that of a logistic model trained with the full unaggregated data.
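
To make the first sentence of the abstract concrete, here is a minimal sketch of "aggregate, then inject noise". It is an illustration under stated assumptions, not the authors' pipeline: the function name `aggregate_with_noise`, the choice of one count table per single feature, and the use of Laplace noise are all assumptions, since the abstract only says "some noise".

```python
import numpy as np

def aggregate_with_noise(X, y, n_modalities, epsilon, rng):
    """Return one noisy count table of shape (n_modalities[j], 2) per feature j.

    X: (n_samples, n_features) integer-coded categorical features.
    y: (n_samples,) binary labels in {0, 1}.
    Assumption: Laplace mechanism with add/remove-one-sample neighbours.
    """
    n_features = X.shape[1]
    # Each sample increments exactly one cell in each feature table, so adding
    # or removing a sample changes the whole release by n_features in L1 norm.
    scale = n_features / epsilon
    tables = []
    for j in range(n_features):
        counts = np.zeros((n_modalities[j], 2))
        # Unbuffered in-place add, so repeated (value, label) pairs accumulate.
        np.add.at(counts, (X[:, j], y), 1.0)
        noise = rng.laplace(loc=0.0, scale=scale, size=counts.shape)
        tables.append(counts + noise)
    return tables

# Hypothetical toy data: 1000 samples, 2 features with 3 modalities each.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(1000, 2))
y = rng.integers(0, 2, size=1000)
tables = aggregate_with_noise(X, y, n_modalities=[3, 3], epsilon=1.0, rng=rng)
```

The released tables no longer contain any individual row, which is why a standard classifier cannot be fitted on them directly and a model of the unobserved joint feature distribution is needed.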
Main file
sample.pdf (472.03 KB)
JMLR Learning From Aggregates.zip (392.64 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03770740, version 1 (06-09-2022)
hal-03770740, version 2 (04-10-2022)

Identifiers

  • HAL Id: hal-03770740, version 1

Cite

Alexandre Gilotte, Ahmed Ben Yamed, David Rohde. Learning from aggregated data with a maximum entropy model. 2022. ⟨hal-03770740v1⟩
35 views
123 downloads
