Publishing Differentially Private Medical Events Data

Abstract: Sequential data has been widely collected in the past few years; in the public health domain it appears as collections of medical events such as lab results, electronic chart records, or hospitalization transactions. Publicly available sequential datasets for research purposes promise new insights, such as understanding patient types and recognizing emerging diseases. Unfortunately, the publication of sequential data presents a significant threat to users’ privacy. Since data owners prefer to avoid such risks, much of the collected data is currently unavailable to researchers. Existing anonymization techniques that aim at preserving sequential patterns lack two important features: handling long sequences and preserving occurrence times. In this paper, we address this challenge by employing an ensemble of Markovian models trained on the source data. The ensemble takes several optional periodicity levels into consideration. Each model captures transitions between times and states according to shorter parts of the sequence, which is eventually reconstructed. Anonymity is provided by utilizing only elements of the model that guarantee differential privacy. Furthermore, we develop a solution for generating differentially private sequential data, which brings us one step closer to publicly available medical datasets of sequential data. We applied this method to two real medical events datasets and obtained encouraging results, demonstrating that the proposed method can be used to publish high-quality anonymized data.
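To make the general idea concrete, the following is a minimal sketch of a single first-order Markov model whose transition counts are perturbed with Laplace noise before synthetic sequences are sampled. It is not the authors' ensemble method: it ignores occurrence times and periodicity levels, and the function names, the `max_transitions` truncation bound, the publicly known state list, and the example events are all assumptions introduced here for illustration.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_transition_model(sequences, states, epsilon, max_transitions, rng):
    """Build a noisy first-order transition matrix (illustrative only).

    Assumes `states` is a public alphabet and that every event in
    `sequences` belongs to it. Each sequence is truncated so that one
    patient contributes at most `max_transitions` transitions, which
    bounds the L1 sensitivity of the count table by `max_transitions`.
    """
    counts = {(a, b): 0.0 for a in states for b in states}
    for seq in sequences:
        truncated = seq[: max_transitions + 1]
        for a, b in zip(truncated, truncated[1:]):
            counts[(a, b)] += 1.0

    scale = max_transitions / epsilon  # Laplace mechanism: sensitivity / epsilon
    model = {}
    for a in states:
        # Noise is added to every cell, including zero counts; clipping and
        # normalizing afterwards is post-processing of the noisy counts.
        row = [max(0.0, counts[(a, b)] + laplace(scale, rng)) for b in states]
        total = sum(row)
        if total == 0.0:
            # Fall back to a uniform row if all noisy counts were clipped.
            row = [1.0] * len(states)
            total = float(len(states))
        model[a] = [w / total for w in row]
    return model


def sample_sequence(model, states, start, length, rng):
    # Generate a synthetic sequence by walking the noisy transition matrix.
    seq = [start]
    while len(seq) < length:
        seq.append(rng.choices(states, weights=model[seq[-1]])[0])
    return seq


# Hypothetical toy data: event names are made up for this example.
rng = random.Random(0)
states = ["admission", "lab", "discharge"]
patients = [
    ["admission", "lab", "lab", "discharge"],
    ["admission", "discharge"],
]
model = noisy_transition_model(patients, states, epsilon=1.0, max_transitions=3, rng=rng)
synthetic = sample_sequence(model, states, start="admission", length=5, rng=rng)
```

With so few input sequences the noise dominates the counts; the sketch only shows where the privacy mechanism sits, and the start state is assumed to be public rather than learned privately.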
Document type:
Conference paper
Francesco Buccafurri; Andreas Holzinger; Peter Kieseberg; A Min Tjoa; Edgar Weippl. International Conference on Availability, Reliability, and Security (CD-ARES), Aug 2016, Salzburg, Austria. Springer International Publishing, Lecture Notes in Computer Science, LNCS-9817, pp.219-235, 2016, Availability, Reliability, and Security in Information Systems. 〈10.1007/978-3-319-45507-5_15〉

Cited literature [12 references]

https://hal.inria.fr/hal-01635024
Contributor: Hal Ifip
Submitted on: Tuesday, 14 November 2017 - 16:07:20
Last modified on: Wednesday, 15 November 2017 - 01:15:12

File

Restricted access
File visible on: 2019-01-01


License

Distributed under a Creative Commons Attribution 4.0 International License


Citation

Sigal Shaked, Lior Rokach. Publishing Differentially Private Medical Events Data. Francesco Buccafurri; Andreas Holzinger; Peter Kieseberg; A Min Tjoa; Edgar Weippl. International Conference on Availability, Reliability, and Security (CD-ARES), Aug 2016, Salzburg, Austria. Springer International Publishing, Lecture Notes in Computer Science, LNCS-9817, pp.219-235, 2016, Availability, Reliability, and Security in Information Systems. 〈10.1007/978-3-319-45507-5_15〉. 〈hal-01635024〉


Metrics

Record views

19