Segmentation Of Broken Characters Using Pattern Matching

Loris Eynard 1 Frank Le Bourgeois 1 Hubert Emptoz 1
1 imagine - Extraction de Caractéristiques et Identification
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract : Current research on OCR systems focuses on corrupted and damaged characters in printed and handwritten documents. Much work has been done on touching characters, but little on broken characters. This paper presents a new method to reconstruct printed characters that have been extracted as several connected components. Our approach is based on the pattern similarity between broken characters and intact ones from the same printed document. In the first step, we use a multi-segmentation algorithm to extract all possible connected components from a document image digitized in grayscale, and then order them by size. Correctly segmented characters are assumed to be larger than the fragments of misrecognized ones. We compute a similarity measure between all connected components, in decreasing order of size. We then localize the broken characters using the bounding box of the correct pattern that gives the best match.
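The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes an already binarized page (a single segmentation rather than the multi-segmentation of a grayscale image), it uses the Jaccard overlap as a stand-in for the similarity measure, which the abstract does not specify, and every function name and the `threshold` parameter are hypothetical.

```python
from collections import deque


def connected_components(img):
    """Return the 8-connected foreground components as sets of (row, col)."""
    h, w = len(img), len(img[0])
    seen, comps = set(), []
    for r in range(h):
        for c in range(w):
            if not img[r][c] or (r, c) in seen:
                continue
            seen.add((r, c))
            queue, pixels = deque([(r, c)]), set()
            while queue:
                y, x = queue.popleft()
                pixels.add((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            comps.append(pixels)
    return comps


def bbox(pixels):
    """Inclusive bounding box (top, left, bottom, right)."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return min(ys), min(xs), max(ys), max(xs)


def box_area(pixels):
    r0, c0, r1, c1 = bbox(pixels)
    return (r1 - r0 + 1) * (c1 - c0 + 1)


def similarity(template, th, tw, img, top, left):
    """Jaccard overlap (an assumed measure) between template and image window."""
    window = {(y, x) for y in range(th) for x in range(tw)
              if img[top + y][left + x]}
    union = len(template | window)
    return len(template & window) / union if union else 0.0


def find_broken_characters(img, threshold=0.7):
    """Return bounding boxes where >= 2 fragments match a larger pattern."""
    h, w = len(img), len(img[0])
    # Largest components first: they are assumed to be intact characters.
    comps = sorted(connected_components(img), key=box_area, reverse=True)
    used = [False] * len(comps)
    boxes = []
    for i, big in enumerate(comps):
        if used[i]:
            continue
        r0, c0, r1, c1 = bbox(big)
        th, tw = r1 - r0 + 1, c1 - c0 + 1
        template = {(y - r0, x - c0) for y, x in big}
        for j in range(i + 1, len(comps)):
            if used[j]:
                continue
            sr0, sc0, sr1, sc1 = bbox(comps[j])
            best_score, best_pos = 0.0, None
            # Try every placement of the pattern box that covers the fragment.
            for top in range(sr1 - th + 1, sr0 + 1):
                for left in range(sc1 - tw + 1, sc0 + 1):
                    if top < 0 or left < 0 or top + th > h or left + tw > w:
                        continue
                    score = similarity(template, th, tw, img, top, left)
                    if score > best_score:
                        best_score, best_pos = score, (top, left)
            if best_pos is None or best_score < threshold:
                continue
            top, left = best_pos
            box = (top, left, top + th - 1, left + tw - 1)
            # Fragments lying entirely inside the matched bounding box.
            members = [k for k in range(i + 1, len(comps)) if not used[k]
                       and bbox(comps[k])[0] >= box[0] and bbox(comps[k])[1] >= box[1]
                       and bbox(comps[k])[2] <= box[2] and bbox(comps[k])[3] <= box[3]]
            if len(members) >= 2:  # a single piece is not a broken character
                for k in members:
                    used[k] = True
                boxes.append(box)
    return boxes


# Toy page: an intact "H" on the left, an "H" with a missing bar on the right.
PAGE = [
    "............",
    ".#.#...#.#..",
    ".#.#...#.#..",
    ".###...#.#..",
    ".#.#...#.#..",
    ".#.#...#.#..",
    "............",
]
image = [[1 if ch == "#" else 0 for ch in row] for row in PAGE]
print(find_broken_characters(image))
```

On this toy page the two detached strokes on the right are grouped under the bounding box of the intact pattern, so the call prints `[(1, 7, 5, 9)]`. The exhaustive placement search is quadratic in the number of components and is only meant to make the size ordering and bounding-box localization concrete.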
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01593294
Contributor : Équipe Gestionnaire Des Publications Si Liris
Submitted on : Tuesday, September 26, 2017 - 9:46:18 AM
Last modification on : Wednesday, October 31, 2018 - 12:24:25 PM

Identifiers

  • HAL Id : hal-01593294, version 1

Citation

Loris Eynard, Frank Le Bourgeois, Hubert Emptoz. Segmentation Of Broken Characters Using Pattern Matching. 9th International Conference on Pattern Recognition and Information Processing, PRIP'2007, May 2007, Minsk, Belarus. pp.101-107. ⟨hal-01593294⟩
