Conference paper, Year: 2014

Learning to Detect Tables in Scanned Document Images using Line Information

Clément Chatelain
Thierry Paquet

Abstract

This paper presents a method for detecting table regions in document images by identifying the column and row line separators and their properties. The method uses a run-length approach to identify the horizontal and vertical lines present in the input image. From each group of intersecting horizontal and vertical lines, a set of 26 low-level features is extracted, and an SVM classifier tests whether the group belongs to a table. The method is evaluated on a heterogeneous corpus of French, English and Arabic documents containing various types of table structures, and its performance is compared with that of the Tesseract OCR system.
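
The run-length idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a clean binary image given as a list of 0/1 rows, treats any run of foreground pixels of at least `min_len` as a line, and uses hypothetical names (`long_runs`, `detect_lines`, `intersects`, `min_len`). The 26 features and the SVM stage are not shown, and a real scan would also need tolerance for noise, skew and broken lines.

```python
from typing import List, Sequence, Tuple

# A run is (fixed_index, start, end_exclusive): for a horizontal run the
# fixed index is the row; for a vertical run it is the column.
Run = Tuple[int, int, int]


def long_runs(rows: Sequence[Sequence[int]], min_len: int) -> List[Run]:
    """Return every run of consecutive foreground pixels (value 1)
    that is at least min_len pixels long, scanning row by row."""
    runs: List[Run] = []
    for r, row in enumerate(rows):
        start = None
        # A trailing 0 acts as a sentinel that closes a run ending at the border.
        for c, px in enumerate(list(row) + [0]):
            if px and start is None:
                start = c
            elif not px and start is not None:
                if c - start >= min_len:
                    runs.append((r, start, c))
                start = None
    return runs


def detect_lines(image: Sequence[Sequence[int]], min_len: int):
    """Find horizontal runs directly and vertical runs on the transposed image."""
    horizontal = long_runs(image, min_len)
    transposed = [list(col) for col in zip(*image)]
    vertical = long_runs(transposed, min_len)
    return horizontal, vertical


def intersects(h: Run, v: Run) -> bool:
    """True if horizontal run h and vertical run v share a pixel."""
    row, c0, c1 = h
    col, r0, r1 = v
    return c0 <= col < c1 and r0 <= row < r1


# Tiny example: one horizontal line (row 1) crossing one vertical line (column 2).
image = [
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
horizontal, vertical = detect_lines(image, min_len=4)
```

Under these assumptions, groups of mutually intersecting runs would be the candidates from which features are computed and passed to a classifier.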
Main file
ICDAR_table_detection.pdf (2.49 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-00934902, version 1 (22-01-2014)

Identifiers

Cite

Thotreingam Kasar, Philippine Barlas, Adam Sébastien, Clément Chatelain, Thierry Paquet. Learning to Detect Tables in Scanned Document Images using Line Information. International Conference on Document Analysis and Recognition, Aug 2013, Washington, United States. pp. 1185-1189, ⟨10.1109/ICDAR.2013.240⟩. ⟨hal-00934902⟩