Malware Detection in PDF Files Using Machine Learning

Abstract: We present how we used machine learning techniques to detect malicious behaviour in PDF files. To this end, we first set up an SVM (Support Vector Machine) classifier that was able to detect 99.7% of malware. However, this classifier was easy to fool with malicious PDF files that we forged to look like clean ones. For instance, we implemented a gradient-descent attack to evade this SVM; the attack was almost 100% successful. Next, we devised countermeasures to this attack: a more elaborate feature selection and the use of a threshold allowed us to stop up to 99.99% of these attacks. Finally, using adversarial learning techniques, we were able to prevent gradient-descent attacks by iteratively feeding the SVM with forged malicious PDF files. We found that after 3 iterations, every PDF file forged by gradient descent was detected, completely preventing the attack.
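The gradient-descent evasion mentioned in the abstract can be illustrated with a minimal sketch. It assumes a linear SVM, for which the gradient of the decision value with respect to the input is simply the weight vector. The weights, bias, feature vector, step size, and the function names `score` and `evade` are all hypothetical and chosen for illustration; they are not the authors' features, model, or implementation. The sketch also ignores the constraints a real attacker faces when modifying an actual PDF file.

```python
def score(w, b, x):
    """Linear SVM decision value; a non-negative value means 'flagged as malicious'."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def evade(w, b, x, step=0.5, max_steps=100):
    """Move x against the gradient of the score until it is classified as clean.

    For a linear model, the gradient of the score with respect to x is w itself.
    Returns the forged feature vector and the number of steps taken.
    """
    forged = list(x)
    for n in range(max_steps):
        if score(w, b, forged) < 0:
            return forged, n
        # One gradient-descent step on the decision value.
        forged = [xi - step * wi for xi, wi in zip(forged, w)]
    return forged, max_steps


# Hypothetical model and sample (assumptions, not taken from the paper).
w, b = [1.0, 2.0], -4.0
malicious = [3.0, 2.0]

forged, steps = evade(w, b, malicious)
print(score(w, b, malicious), forged, steps)
```

In the adversarial-learning setting described in the abstract, vectors such as `forged` would be labelled malicious and added to the training set before retraining the classifier; that retraining loop is not shown here.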
Document type:
Conference paper
SECRYPT 2018 - 15th International Conference on Security and Cryptography, Jul 2018, Porto, Portugal. 8p., 2018

https://hal.archives-ouvertes.fr/hal-01704766
Contributor: Mathieu Valois
Submitted on: Monday, 20 August 2018 - 11:32:46
Last modified on: Thursday, 15 November 2018 - 11:58:57

File

Malware Detection in PDF Files...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01704766, version 2

Citation

Bonan Cuan, Aliénor Damien, Claire Delaplace, Mathieu Valois. Malware Detection in PDF Files Using Machine Learning. SECRYPT 2018 - 15th International Conference on Security and Cryptography, Jul 2018, Porto, Portugal. 8p., 2018. 〈hal-01704766v2〉

Metrics

Record views

551

File downloads

539