Conference papers

Inexact graph matching for entity recognition in OCRed documents

Nihel Kooli 1 Abdel Belaid 1
1 READ - Recognition of writing and analysis of documents
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: This paper proposes a system for recognizing entities in document images processed by OCR. The system is based on a graph matching technique and is guided by a database whose records describe the entities. The input to the system is a document labeled with entity attributes. A first grouping of these labels, based on a score function, yields a selected set of candidate entities. Entity labels that are locally close are modeled by a structure graph, which is then matched against model graphs learned for this purpose. The graph matching technique relies on a specific cost function that integrates feature dissimilarities. The matching results are used to correct mislabeling errors and then to validate the entity recognition. Evaluation on three datasets covering different kinds of entities shows recall between 88.3% and 95% and precision between 94.3% and 95.7%.
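
As a rough illustration of the matching step described in the abstract, the sketch below matches a small structure graph against a single model graph by minimizing a cost that sums node and edge dissimilarities, then uses the best mapping to correct a mislabeled node. It is not the authors' cost function or search strategy: the 0/1 label cost, the edge-weight difference, the exhaustive search over permutations, and all names (`node_cost`, `edge_cost`, `match_cost`, `best_match`, the example labels) are assumptions made for illustration only.

```python
from itertools import permutations

def node_cost(a, b):
    # Assumed node dissimilarity: 0 if the two labels agree, 1 otherwise.
    return 0.0 if a == b else 1.0

def edge_cost(w1, w2):
    # Assumed edge dissimilarity: absolute difference of edge weights
    # (a missing edge is treated as weight 0 by the caller).
    return abs(w1 - w2)

def match_cost(g, model, mapping):
    # Cost of sending node i of g to node mapping[i] of model.
    n = len(g["labels"])
    cost = sum(node_cost(g["labels"][i], model["labels"][mapping[i]])
               for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            w_g = g["edges"].get((i, j), 0.0)
            a, b = sorted((mapping[i], mapping[j]))
            w_m = model["edges"].get((a, b), 0.0)
            cost += edge_cost(w_g, w_m)
    return cost

def best_match(g, model):
    # Exhaustive search over injective mappings; only viable for tiny graphs.
    n, m = len(g["labels"]), len(model["labels"])
    best = min(permutations(range(m), n),
               key=lambda p: match_cost(g, model, p))
    return best, match_cost(g, model, best)

# Hypothetical model graph for an address-like entity.
# Undirected edges are stored with keys (i, j) where i < j.
model = {"labels": ["name", "street", "city"],
         "edges": {(0, 1): 1.0, (1, 2): 1.0}}

# Hypothetical structure graph built from OCR labels;
# the third node carries a wrong label ("zip" instead of "city").
g = {"labels": ["street", "name", "zip"],
     "edges": {(0, 1): 1.0, (0, 2): 1.0}}

mapping, cost = best_match(g, model)
# Relabel each node with the label of the model node it was matched to.
corrected = [model["labels"][k] for k in mapping]
print(mapping, cost, corrected)
```

In this toy case the graph structure outweighs the single label mismatch, so the mislabeled node is still aligned with the model's "city" node and can be relabeled. A practical system would replace the exhaustive search with a scalable inexact matching method.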

https://hal.inria.fr/hal-01515412
Contributor: Nihel Kooli
Submitted on: Thursday, April 27, 2017 - 5:25:47 PM
Last modification on: Friday, January 15, 2021 - 5:42:02 PM
Long-term archiving on: Friday, July 28, 2017 - 1:07:52 PM

File

ICPR_2016.pdf
Files produced by the author(s)

Citation

Nihel Kooli, Abdel Belaid. Inexact graph matching for entity recognition in OCRed documents. ICPR, Dec 2016, Mexico, Mexico. pp.4071 - 4076, ⟨10.1109/ICPR.2016.7900271⟩. ⟨hal-01515412⟩
