Efficient and Privacy-Preserving k-Means Clustering for Big Data Mining

Zakaria Gheid; Yacine Challal

doi:10.1109/TrustCom.2016.0140

Conference paper, Year: 2016

Zakaria Gheid

Role: Author

Laboratoire de Méthodes de Conception de Systèmes

Yacine Challal

Role: Author
PersonId : 8362
IdHAL : yacine-challal
ORCID : 0000-0002-9237-6210
IdRef : 090236165

Laboratoire de Méthodes de Conception de Systèmes

Centre de recherche sur l'Information Scientifique et Technique

Abstract

Recent advances in sensing and storage technologies have led to the big data age, in which huge amounts of data are distributed across sites to be stored and analysed. Cluster analysis is a data mining task that aims to discover patterns and knowledge through algorithmic techniques such as k-means. However, running k-means over distributed big data stores raises serious privacy issues. Many proposed works have attempted to address this concern using cryptographic protocols, but these cryptographic solutions degrade the performance of analysis tasks, which is incompatible with the properties of big data. In this work we propose a novel privacy-preserving k-means algorithm based on a simple yet secure and efficient multi-party additive scheme that is cryptography-free. We designed our solution for horizontally partitioned data. Moreover, we demonstrate that our scheme resists adversaries in the passive model.
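
To make the idea of a multi-party additive scheme over horizontally partitioned data more concrete, the following is a minimal sketch only. The abstract does not describe the authors' protocol, so this is a generic additive-masking secure sum, not their scheme. It assumes each site holds the coordinate sum and the size of its local part of a cluster, and that only aggregated values are ever revealed. The names `MODULUS`, `SCALE`, `make_shares`, `secure_sum` and `update_centroid` are hypothetical and chosen for illustration.

```python
import random

MODULUS = 2**61 - 1  # hypothetical modulus for the additive shares
SCALE = 1000         # hypothetical fixed-point scale for real-valued features


def make_shares(value, n_parties, rng):
    """Split an integer into n_parties random shares that sum to value mod MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values, rng):
    """Sum private integers while only ever combining random-looking shares."""
    n = len(private_values)
    # Row i holds the shares that party i sends out (one share per party).
    sent = [make_shares(v % MODULUS, n, rng) for v in private_values]
    # Party j adds up the shares it received and publishes only that partial sum.
    partials = [sum(sent[i][j] for i in range(n)) % MODULUS for j in range(n)]
    total = sum(partials) % MODULUS
    # Map the modular result back to a signed integer.
    return total - MODULUS if total > MODULUS // 2 else total


def update_centroid(local_sums, local_counts, rng):
    """One centroid update for one cluster from per-site coordinate sums and counts."""
    dims = len(local_sums[0])
    count = secure_sum(local_counts, rng)  # assumed non-zero in this sketch
    centroid = []
    for d in range(dims):
        scaled = [round(s[d] * SCALE) for s in local_sums]
        centroid.append(secure_sum(scaled, rng) / SCALE / count)
    return centroid


# Seeded generator for a reproducible demo; not suitable for real masking.
rng = random.Random(0)
# Three sites, each holding the coordinate sum and size of its part of one cluster.
sums = [(2.0, 4.0), (3.0, 3.0), (1.0, -1.0)]
counts = [2, 3, 1]
print(update_centroid(sums, counts, rng))  # [1.0, 1.0]
```

In this sketch the sites run in a single process, so it shows only the arithmetic of the aggregation; a real deployment would exchange shares over a network and would need a careful analysis of what the published partial sums reveal.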

Keywords

horizontally partitioned data; k-means clustering; privacy; efficiency

Domains

Networks and telecommunications [cs.NI]

Main file

k-means.pdf (396.49 KB)

Origin: Files produced by the author(s)

Contributor: Yacine Challal

https://hal.science/hal-01466904

Submitted on: Monday, 13 February 2017, 18:38:23

Last modified on: Thursday, 4 April 2019, 10:18:07

Long-term archiving on: Sunday, 14 May 2017, 16:51:15

Dates and versions

hal-01466904, version 1 (13-02-2017)

Identifiers

HAL Id: hal-01466904, version 1
DOI: 10.1109/TrustCom.2016.0140

Cite

Zakaria Gheid, Yacine Challal. Efficient and Privacy-Preserving k-Means Clustering for Big Data Mining. IEEE TrustCom, Aug 2016, Tianjin, China. pp. 791-798, ⟨10.1109/TrustCom.2016.0140⟩. ⟨hal-01466904⟩

61 views

727 downloads
