Adaptative Smart-Binarization Method for Images of Business Documents

Djamel Gaceb 1 Frank Le Bourgeois 1 Jean Duong 1
1 imagine - Extraction de Caractéristiques et Identification
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract : Automatic reading systems for business documents require fast and accurate reading of zones of interest using OCR technology. The quality of the binarization has a major impact on the quality of the binarized characters. In this paper we propose a smart-binarization method for images of business documents. Our work takes into account the different degradations found in document images, real-time constraints and the high spatial resolution of the images. The quality of each pixel is estimated using hierarchical local thresholding in order to classify it as a foreground, background or ambiguous pixel. The ambiguous pixels, which represent the degraded zones, cannot be binarized with the same local thresholding. The global quality of the image is therefore estimated from the density of these degraded pixels. If the image is considered degraded, we apply a second separation step that assigns each ambiguous pixel to the background or the foreground. This second step uses our improved relaxation method, which we have accelerated for the first time so that it can be integrated into an automatic document reading system. Compared with existing binarization approaches (local or global), our approach offers a better reading of characters by the OCR. Thanks to the use of integral images, the computation time remains constant as the local window size varies. The method was developed in the context of the DOD (Documents On Demand) project at the request of the ITESOFT company.
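The two ideas in the abstract that lend themselves to a short illustration are the three-way pixel labelling and the integral-image trick that keeps the cost independent of the window size. The Python sketch below is not the authors' algorithm: the abstract does not give the hierarchical thresholding rule, so a simple pair of offsets below the local mean stands in for it. The function names, the `strong` and `weak` offsets and the `max_ambiguous_ratio` cut-off are all hypothetical, and the relaxation step applied to ambiguous pixels is not shown.

```python
import numpy as np

FOREGROUND, BACKGROUND, AMBIGUOUS = 0, 1, 2  # hypothetical label codes


def local_mean(gray, radius):
    """Mean over a (2*radius+1)^2 window around every pixel.

    Uses an integral image, so each window sum costs four lookups
    regardless of the radius. Windows are clipped at the image border.
    """
    img = np.asarray(gray, dtype=np.float64)
    h, w = img.shape
    # Integral image padded with a leading zero row and column.
    ii = np.zeros((h + 1, w + 1), dtype=np.float64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    # Clipped window bounds for every row and column.
    y0 = np.clip(np.arange(h) - radius, 0, h)
    y1 = np.clip(np.arange(h) + radius + 1, 0, h)
    x0 = np.clip(np.arange(w) - radius, 0, w)
    x1 = np.clip(np.arange(w) + radius + 1, 0, w)
    # Four-corner rule: sum = D - B - C + A.
    sums = ii[y1][:, x1] - ii[y0][:, x1] - ii[y1][:, x0] + ii[y0][:, x0]
    area = (y1 - y0)[:, None] * (x1 - x0)[None, :]
    return sums / area


def classify_pixels(gray, radius=7, strong=40.0, weak=10.0):
    """Label pixels as foreground, background or ambiguous.

    Assumed stand-in rule (dark text on a light page): a pixel at least
    `strong` below its local mean is ink, a pixel less than `weak` below
    it is paper, and anything in between is left ambiguous.
    """
    img = np.asarray(gray, dtype=np.float64)
    mean = local_mean(img, radius)
    labels = np.full(img.shape, AMBIGUOUS, dtype=np.uint8)
    labels[img <= mean - strong] = FOREGROUND
    labels[img >= mean - weak] = BACKGROUND
    return labels


def is_degraded(labels, max_ambiguous_ratio=0.05):
    """Flag the image as degraded when ambiguous pixels are too dense."""
    return bool(np.mean(labels == AMBIGUOUS) > max_ambiguous_ratio)
```

In a pipeline of this shape, only images for which `is_degraded` returns true would be handed to the slower second pass over the ambiguous pixels, which is how the abstract describes keeping the common case fast.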
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01339207
Contributor : Équipe Gestionnaire Des Publications Si Liris
Submitted on : Wednesday, June 29, 2016 - 3:48:51 PM
Last modification on : Wednesday, October 31, 2018 - 12:24:25 PM

Identifiers

  • HAL Id : hal-01339207, version 1

Citation

Djamel Gaceb, Frank Le Bourgeois, Jean Duong. Adaptative Smart-Binarization Method for Images of Business Documents. Twelfth International Conference on Document Analysis and Recognition (ICDAR 2013), Aug 2013, Washington, DC, United States. pp.118-122. ⟨hal-01339207⟩
