Conference paper, Year: 2017

Robustness of character recognition techniques to double print-and-scan process

Abstract

Verifying the integrity of printed and scanned documents is an active research topic. Several solutions have been proposed for documents printed and scanned once. However, forged documents quite often pass through a double print-and-scan (P&S) process. The P&S process strongly affects the shape and color of characters, so leading OCR engines cannot recognize these characters correctly. In this paper, we present the problems that the Tesseract OCR engine faces when trying to recognize characters that have been printed and scanned twice. We suggest using a PCA-based character recognition method, which outperforms Tesseract OCR in our experiments. We also show that a pre-processing step can improve the recognition results for double printed-and-scanned documents. Finally, we discuss the pros and cons of the PCA-based recognition method.
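The abstract does not describe how the PCA-based recognizer is built, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes toy 5x5 binary glyphs, principal components estimated by power iteration, and nearest-neighbour matching in the projected space. All names (`fit_pca`, `project`, `classify`, `train_model`, `GLYPHS`) are hypothetical and chosen for illustration.

```python
import math
import random

# Toy 5x5 templates; real inputs would be segmented character images.
GLYPHS = {
    "I": ["..#..", "..#..", "..#..", "..#..", "..#.."],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
    "O": ["#####", "#...#", "#...#", "#...#", "#####"],
}

def to_vector(rows):
    """Flatten a glyph into a list of 0.0/1.0 pixel values."""
    return [1.0 if ch == "#" else 0.0 for row in rows for ch in row]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_pca(samples, n_components, n_iter=200, seed=0):
    """Return (mean, components). n_components must not exceed the data rank."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(d)]
    centered = [[s[i] - mean[i] for i in range(d)] for s in samples]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(n_iter):
            # Covariance-times-vector without forming the d x d matrix:
            # C v is proportional to sum_i (x_i . v) x_i over centered samples.
            weights = [dot(x, v) for x in centered]
            v = [sum(w * x[i] for w, x in zip(weights, centered)) for i in range(d)]
            # Gram-Schmidt against earlier components keeps the basis orthonormal.
            for c in components:
                p = dot(v, c)
                v = [vi - p * ci for vi, ci in zip(v, c)]
            norm = math.sqrt(dot(v, v))
            v = [vi / norm for vi in v]
        components.append(v)
    return mean, components

def project(vec, mean, components):
    """Coordinates of a pixel vector in the PCA subspace."""
    centered = [a - m for a, m in zip(vec, mean)]
    return [dot(centered, c) for c in components]

def train_model(glyphs, n_components):
    labels = list(glyphs)
    vectors = [to_vector(glyphs[k]) for k in labels]
    mean, components = fit_pca(vectors, n_components)
    train = [(project(v, mean, components), k) for v, k in zip(vectors, labels)]
    return mean, components, train

def classify(vec, model):
    """Label of the training glyph nearest to vec in the PCA subspace."""
    mean, components, train = model
    z = project(vec, mean, components)
    return min(train, key=lambda item: math.dist(z, item[0]))[1]

model = train_model(GLYPHS, n_components=3)
noisy_l = to_vector(GLYPHS["L"])
noisy_l[12] = 1.0  # flip the centre pixel to mimic a speck of scan noise
print(classify(noisy_l, model))
```

In this sketch the number of components equals the rank of the four-template training set; a practical system would train on many degraded samples per character and keep far fewer components than pixels.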
Main file
IWCDF_006.pdf (220.69 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01900029, version 1 (20-10-2018)

Identifiers

  • HAL Id: hal-01900029, version 1

Cite

Iuliia Tkachenko, Petra Gomez-Krämer. Robustness of character recognition techniques to double print-and-scan process. First International Workshop on Computational Document Forensics, Nov 2017, Kyoto, Japan. ⟨hal-01900029⟩

Collections

L3I UNIV-ROCHELLE
216 Views
640 Downloads
