An Efficient New PDE-based Characters Reconstruction After Graphics Removal

Louisa Kessi 1, Frank Le Bourgeois 1, Christophe Garcia 1
1 imagine - Extraction de Caractéristiques et Identification
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract : Separating text from graphics when they overlap is a challenging problem for digitization companies. In a previous work [1], we presented the first unsupervised, fully automatic segmentation system adapted to colour business documents with significant colour complexity and dithered backgrounds. The system performs several operations to segment colour images automatically, separate text from noise and graphics, and provide information about text colour. After overlapped characters are split and characters are separated from graphics, the characters are left broken. The OCR system cannot reliably recognize broken characters, and its performance is thus seriously affected. This paper presents the first character reconstruction system, based on a new PDE (partial differential equation) approach. Our approach benefits from combining the anisotropic morphology proposed by Breuß with the Weickert coherence-enhancing shock filter diffusion. We introduce a continuous anisotropic morphology method driven by the main direction of the first-order tensors computed in the neighborhood of the missing part left by the separation of text and graphics. It reconstructs the missing part even when the area left is larger than the stroke width. The coherence of the tensor orientations around missing parts overcomes the problem of image noise. Applying the ABBYY FineReader OCR engine to the reconstructed characters shows a substantial reduction in OCR errors. Our experiments show that our proposal, unlike the existing state of the art, requires no training step and outperforms both anisotropic morphology and the Weickert coherence-enhancing shock filter diffusion applied separately.
Document type : Conference papers

https://hal.archives-ouvertes.fr/hal-01345831
Contributor : Louisa Kessi
Submitted on : Friday, September 23, 2016 - 4:51:21 PM
Last modification on : Tuesday, February 26, 2019 - 11:20:48 AM

File

ICFHR 2016-final submission.pd...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01345831, version 1

Citation

Louisa Kessi, Frank Le Bourgeois, Christophe Garcia. An Efficient New PDE-based Characters Reconstruction After Graphics Removal. 15th International Conference on Frontiers in Handwriting Recognition (ICFHR-2016), Oct 2016, Shenzhen, China. ⟨hal-01345831⟩
