Conference Paper, Year: 2014

A new hybrid binarization method based on Kmeans

Abstract

Document binarization is a fundamental processing step toward Optical Character Recognition (OCR). It aims to separate the foreground text from the document background. In this article, we propose a novel binarization technique that combines local and global approaches using the Kmeans clustering algorithm. The proposed Hybrid Binarization based on Kmeans (HBK) performs robust binarization of scanned documents. Across several experiments, we demonstrate that HBK improves binarization quality while minimizing distortion. Moreover, it outperforms several well-known state-of-the-art methods in OCR evaluation.
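The abstract does not spell out how HBK combines its local and global stages, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a grayscale image stored as a list of rows, runs a two-cluster Kmeans on pixel intensities to obtain a threshold, and applies that threshold per tile, falling back to the whole-image threshold when a tile has little contrast. The function names, the tile size `block`, and the `min_contrast` fallback rule are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def two_means_threshold(values: Sequence[float], iters: int = 50) -> float:
    """Run 1-D Kmeans with k=2 and return the midpoint of the two centroids."""
    lo, hi = float(min(values)), float(max(values))
    if lo == hi:
        return lo  # flat region: there is nothing to separate
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        # Assign each value to its nearest centroid (split at the midpoint).
        dark = [v for v in values if v <= mid]
        light = [v for v in values if v > mid]
        new_lo = sum(dark) / len(dark)
        new_hi = sum(light) / len(light)
        if new_lo == lo and new_hi == hi:
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


def binarize_hybrid(image: List[List[int]], block: int = 4,
                    min_contrast: float = 30.0) -> List[List[int]]:
    """Illustrative local/global binarization (assumed rule, not HBK itself).

    Output uses 0 for dark (text) pixels and 255 for light (background) pixels.
    """
    h, w = len(image), len(image[0])
    global_t = two_means_threshold([p for row in image for p in row])
    out = [[255] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            ys = range(by, min(by + block, h))
            xs = range(bx, min(bx + block, w))
            tile = [image[y][x] for y in ys for x in xs]
            # Assumption: trust the local clusters only when the tile has
            # enough contrast; otherwise reuse the global threshold.
            if max(tile) - min(tile) >= min_contrast:
                t = two_means_threshold(tile)
            else:
                t = global_t
            for y in ys:
                for x in xs:
                    out[y][x] = 255 if image[y][x] > t else 0
    return out


# Two tiles with different background brightness (uneven illumination).
page = [[200, 200, 50, 200, 120, 20, 120, 120] for _ in range(4)]
binary = binarize_hybrid(page, block=4)
```

In this toy page, each tile gets its own threshold, so the dark strokes are separated from both the bright and the dimmer background. The fallback to the global threshold is one simple way to avoid splitting noise in tiles that contain no text.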
Main file
ISSCP14_HBK(publie).pdf (224.72 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01305856, version 1 (21-04-2016)

Identifiers

Cite

Mahmoud Soua, Rostom Kachouri, Mohamed Akil. A new hybrid binarization method based on Kmeans. 6th International Symposium on Communications, Control and Signal Processing (ISCCSP), May 2014, Athens, Greece. ⟨10.1109/ISCCSP.2014.6877830⟩. ⟨hal-01305856⟩
