Conference paper, Year: 2016

Inexact graph matching for entity recognition in OCRed documents

Abstract

This paper proposes a system for recognizing entities in document images processed by OCR. The system is based on a graph matching technique and is guided by a database whose records describe the entities. The input is a document labeled with entity attributes. A first grouping of these labels, based on a score function, yields a set of candidate entities. The entity labels that are locally close are modeled by a structure graph, which is then matched against model graphs learned for this purpose. The graph matching technique relies on a specific cost function that integrates feature dissimilarities. The matching results are used to correct mislabeling errors and then to validate the entity recognition. Evaluated on three datasets covering different kinds of entities, the system achieves recall between 88.3% and 95% and precision between 94.3% and 95.7%.
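To make the matching step concrete, here is a minimal sketch of inexact graph matching driven by a dissimilarity cost. It is an illustration under stated assumptions, not the method of the paper: the abstract does not give the cost function, the features, or the search strategy. The node format `(label, x, y)`, the `label_penalty` parameter, the helper names, and the exhaustive search over mappings are all hypothetical choices made for this example.

```python
from itertools import permutations
from math import hypot

# Hypothetical node format: (label, x, y). The label is an entity attribute
# type (e.g. "name", "zip"); x and y are layout coordinates in arbitrary units.


def matching_cost(candidate, model, mapping, label_penalty=1.0):
    """Cost of mapping candidate node i to model node mapping[i].

    Assumed cost (not taken from the paper): a fixed penalty per label
    mismatch, plus the difference between the relative offsets of every
    matched pair of nodes.
    """
    cost = 0.0
    # Node term: penalize each label disagreement.
    for i, k in enumerate(mapping):
        if candidate[i][0] != model[k][0]:
            cost += label_penalty
    # Edge term: compare the offset between each pair of candidate nodes
    # with the offset between the model nodes they are mapped to.
    for i in range(len(mapping)):
        for j in range(i + 1, len(mapping)):
            k, l = mapping[i], mapping[j]
            dx = (candidate[j][1] - candidate[i][1]) - (model[l][1] - model[k][1])
            dy = (candidate[j][2] - candidate[i][2]) - (model[l][2] - model[k][2])
            cost += hypot(dx, dy)
    return cost


def best_match(candidate, model):
    """Exhaustive search over injective mappings; only viable for small graphs."""
    if len(candidate) > len(model):
        raise ValueError("candidate graph has more nodes than the model")
    best = min(
        permutations(range(len(model)), len(candidate)),
        key=lambda m: matching_cost(candidate, model, m),
    )
    return best, matching_cost(candidate, model, best)


def label_corrections(candidate, model, mapping):
    """Suggest the model label wherever a matched candidate label disagrees."""
    return {
        i: model[k][0]
        for i, k in enumerate(mapping)
        if candidate[i][0] != model[k][0]
    }


# Toy address-like entity: the third candidate node is mislabeled "city".
model = [("name", 0, 0), ("street", 0, 1), ("zip", 0, 2), ("city", 1, 2)]
candidate = [("name", 5, 5), ("street", 5, 6), ("city", 5, 7), ("city", 6, 7)]

mapping, cost = best_match(candidate, model)
corrections = label_corrections(candidate, model, mapping)
```

In this toy case the candidate has the same layout as the model, shifted by a constant offset, so the structural term is zero for the identity mapping and the only remaining cost is the single label mismatch. The matched model node then suggests relabeling the third node as "zip". Exhaustive search grows factorially with graph size, so a real system would need a more efficient matching strategy.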
Main file
ICPR_2016.pdf (851.59 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01515412 , version 1 (27-04-2017)

Cite

Nihel Kooli, Abdel Belaid. Inexact graph matching for entity recognition in OCRed documents. ICPR, Dec 2016, Mexico, Mexico. pp.4071 - 4076, ⟨10.1109/ICPR.2016.7900271⟩. ⟨hal-01515412⟩