Conference paper. Year: 2016

Caradoc: a pragmatic approach to PDF parsing and validation

Abstract

PDF has become a de facto standard for exchanging electronic documents, for visualization as well as for printing. However, it has also become a common delivery channel for malware, and previous work has highlighted features that lead to security issues. In our work, we focus on the structure of the format, independently from specific features. By methodically testing PDF readers against hand-crafted files, we show that the interpretation of PDF files at the structural level may cause some form of denial of service, or be ambiguous and lead to rendering inconsistencies among readers. We then propose a pragmatic solution by restricting the syntax to avoid common errors, and propose a formal grammar for it. We explain how data consistency can be validated at a finer-grained level using a dedicated type checker. Finally, we assess this approach on a set of real-world files and show that our proposals are realistic.
Main file
document.pdf (239.56 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02143706, version 1 (29-05-2019)

Identifiers

  • HAL Id: hal-02143706, version 1

Cite

Guillaume Endignoux, Olivier Levillain, Jean-Yves Migeon. Caradoc: a pragmatic approach to PDF parsing and validation. 2016 IEEE Security and Privacy Workshops (SPW), May 2016, San Jose, United States. pp. 126-139. ⟨hal-02143706⟩
44 views
1177 downloads
