Conference papers

Privacy-Preserving Synthetic Educational Data Generation

Jill-Jênn Vie (1), Tomas Rigaux (1), Sein Minn (2)
(2) CEDAR - Rich Data Analytics at Cloud Scale
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France
Abstract : Institutions collect massive learning traces but may not disclose them because of privacy concerns. Synthetic data generation opens new opportunities for research in education. In this paper, we present a generative model for educational data that can preserve the privacy of participants, and an evaluation framework for comparing synthetic data generators. We show how naive pseudonymization can lead to re-identification threats and suggest techniques to guarantee privacy. We evaluate our method on existing massive open educational datasets.

https://hal.inria.fr/hal-03715416
Contributor : Jill-Jênn Vie
Submitted on : Wednesday, July 6, 2022 - 1:41:10 PM
Last modification on : Saturday, July 9, 2022 - 3:32:56 AM

Files

EC_TEL_2022_paper_83_Vie.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-03715416, version 1
  • ARXIV : 2207.03202

Citation

Jill-Jênn Vie, Tomas Rigaux, Sein Minn. Privacy-Preserving Synthetic Educational Data Generation. EC-TEL 2022, Sep 2022, Toulouse, France. ⟨hal-03715416⟩
