Conference paper, Year: 2018

Performance/cost analysis of a cloud based solution for big data analytic: Application in intrusion detection

Abstract

The main goal of ‘Big Data’ technology is to provide new techniques and tools for ingesting and storing large amounts of generated data so that it can be analyzed and processed to yield insights and predictions that open new opportunities for improving our lives in different domains. In this context, ‘Big Data’ addresses two essential issues: real-time analysis, driven by the increasing velocity at which data is generated, and long-term analysis, driven by the huge volume of stored data. To deal with these two issues, we propose in this paper a cloud-based solution for big data analytics on the Amazon cloud. Our objective is to evaluate the performance of the Big Data services offered with respect to the volume and velocity of the processed data. The dataset we use contains information about “network connections” in approximately 5 million records with 41 features; the solution works as a network intrusion detector. It receives data records in real time from a Raspberry Pi node and predicts whether each connection is bad (malicious intrusion or attack) or good (normal connection). The prediction model is a logistic regression model. We evaluate the cloud resources needed to train the machine learning model (batch processing) and to classify new streaming data with the trained model in real time (real-time processing). The solution achieved high accuracy, and the results show that working with Big Data in the cloud mainly involves a cost/performance trade-off: processing performance, in terms of response time for both long-term and real-time analysis, can always be guaranteed once the cloud resources are provisioned according to the needs.
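The two phases described in the abstract (batch training, then per-record prediction on a stream) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses no cloud service, the records are synthetic rather than taken from the paper's dataset, and the logistic regression is fitted with a plain stochastic gradient descent written in pure Python. All names (`make_record`, `train_batch`, `predict`) and all hyperparameters are hypothetical choices made for illustration; only the feature count of 41 comes from the abstract.

```python
import math
import random
import time

N_FEATURES = 41  # feature count stated in the abstract


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_record(rng, malicious):
    # Synthetic stand-in for a connection record (assumption): malicious
    # records are centered at +1 on every feature, normal ones at -1.
    center = 1.0 if malicious else -1.0
    return [rng.gauss(center, 1.0) for _ in range(N_FEATURES)]


def train_batch(records, labels, epochs=5, lr=0.05):
    # Batch phase: fit weights and bias with stochastic gradient descent.
    weights = [0.0] * N_FEATURES
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(records, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(model, x):
    # Real-time phase: score one incoming record; 1 = bad, 0 = good.
    weights, bias = model
    score = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return 1 if score >= 0.5 else 0


rng = random.Random(0)
train_y = [i % 2 for i in range(400)]
train_x = [make_record(rng, y == 1) for y in train_y]
model = train_batch(train_x, train_y)

# Simulated stream: score records one at a time and time each prediction.
stream_y = [i % 2 for i in range(100)]
hits, latencies = 0, []
for y in stream_y:
    x = make_record(rng, y == 1)
    start = time.perf_counter()
    hits += predict(model, x) == y
    latencies.append(time.perf_counter() - start)

accuracy = hits / len(stream_y)
mean_latency = sum(latencies) / len(latencies)
```

In this sketch, `accuracy` and `mean_latency` stand in for the two quantities the abstract weighs against cloud cost: prediction quality and per-record response time. Training time on the full batch could be measured in the same way by timing the `train_batch` call.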
No file deposited

Dates and versions

hal-02155795, version 1 (13-06-2019)

Identifiers

  • HAL Id: hal-02155795, version 1

Cite

Nada Chendeb Taher, Imane Mallat, Nazim Agoulmine, Nour El-Mawass. Performance/cost analysis of a cloud based solution for big data analytic: Application in intrusion detection. 1st International Conference on Big Data and Cyber-Security Intelligence (BDCSIntell 2018), Dec 2018, Beirut, Lebanon. pp. 34-41. ⟨hal-02155795⟩
