Conference paper, Year: 2013

Learning to Detect Tables in Scanned Document Images using Line Information

Abstract

This paper presents a method to detect table regions in document images by identifying the column and row line separators and their properties. The method employs a run-length approach to identify the horizontal and vertical lines present in the input image. From each group of intersecting horizontal and vertical lines, a set of 26 low-level features is extracted, and an SVM classifier is used to test whether the group belongs to a table. The performance of the method is evaluated on a heterogeneous corpus of French, English and Arabic documents that contain various types of table structures, and is compared with that of the Tesseract OCR system.
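
The run-length step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give thresholds, the grouping rule, or the 26 features, so the function names (`horizontal_runs`, `vertical_runs`, `runs_intersect`), the `min_length` parameter, and the toy image are all assumptions made for illustration. The sketch only finds long runs of foreground pixels along rows and columns of a binary image and checks whether a horizontal and a vertical run cross.

```python
from typing import List, Tuple

# A run is (fixed_index, start, end_exclusive); this layout is an assumption.
Run = Tuple[int, int, int]


def horizontal_runs(image: List[List[int]], min_length: int) -> List[Run]:
    """Return (row, col_start, col_end) for each run of 1-pixels
    that is at least min_length pixels long."""
    runs: List[Run] = []
    for r, row in enumerate(image):
        start = None
        # A trailing 0 acts as a sentinel that closes a run ending at the border.
        for c, pixel in enumerate(list(row) + [0]):
            if pixel and start is None:
                start = c
            elif not pixel and start is not None:
                if c - start >= min_length:
                    runs.append((r, start, c))
                start = None
    return runs


def vertical_runs(image: List[List[int]], min_length: int) -> List[Run]:
    """Return (col, row_start, row_end): horizontal runs of the transposed image."""
    transposed = [list(col) for col in zip(*image)]
    return horizontal_runs(transposed, min_length)


def runs_intersect(h: Run, v: Run) -> bool:
    """True if horizontal run h and vertical run v share a pixel."""
    row, c0, c1 = h
    col, r0, r1 = v
    return c0 <= col < c1 and r0 <= row < r1


# Toy binary image (1 = ink): a rectangular frame, standing in for a table border.
image = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]

h_lines = horizontal_runs(image, min_length=4)
v_lines = vertical_runs(image, min_length=4)
crossings = [(h, v) for h in h_lines for v in v_lines if runs_intersect(h, v)]
```

In a pipeline like the one the abstract describes, each group of mutually intersecting lines would then be turned into a feature vector and passed to a classifier; that part is left out here because the abstract does not specify the features.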
Main file
kasar_icdar2013.pdf (2.87 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-00905546, version 1 (19-11-2013)

Identifiers

  • HAL Id: hal-00905546, version 1

Cite

Thotreingam Kasar, Philippine Barlas, Sebastien Adam, Clément Chatelain, Thierry Paquet. Learning to Detect Tables in Scanned Document Images using Line Information. ICDAR, 2013, France. pp.1185-1189. ⟨hal-00905546⟩
142 views
904 downloads