Semantic Label and Structure Model based Approach for Entity Recognition in Database Context - HAL open archive
Conference paper, Year: 2015

Semantic Label and Structure Model based Approach for Entity Recognition in Database Context

Abstract

This paper proposes an approach for recognizing entities in scanned documents by referring to their descriptions in database records. First, the document fields that correspond to database record values are labeled. Second, entities are identified by their labels and ranked using a TF/IDF-based score. For each entity, the local labels are grouped into a graph. This graph is matched, using a specific cost function, against a graph model (the structure model) that represents the geometric structures of local entity labels. The model is trained on a set of carefully chosen entities that are semi-automatically annotated. Finally, a correction step uses the geometric relationships between labels to correct any entity mislabeling. Evaluated on 200 business documents containing 500 entities, the approach reaches about 93% recall and 97% precision.
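The first two steps of the abstract (labeling document fields from record values, then ranking candidate entities with a TF/IDF-based score) can be illustrated with a minimal Python sketch. This is not the authors' scoring function, which the abstract does not specify. It assumes exact, case-insensitive substring matching (real scanned documents would need OCR-tolerant matching) and a plain smoothed IDF weight. The names `idf_weights` and `rank_entities` and the sample records are hypothetical.

```python
import math
from collections import Counter


def idf_weights(records):
    """Smoothed IDF of each field value across all database records (assumed weighting)."""
    n = len(records)
    df = Counter()
    for fields in records.values():
        # Count each distinct value once per record.
        for value in {v.lower() for v in fields.values()}:
            df[value] += 1
    return {value: math.log(n / count) + 1.0 for value, count in df.items()}


def rank_entities(records, document_text):
    """Label the document with record values, then rank records by summed IDF of matches.

    Returns a list of (entity_id, score, labels) sorted by decreasing score,
    where labels maps each matched field name to the matched value.
    """
    text = document_text.lower()
    idf = idf_weights(records)
    ranked = []
    for entity_id, fields in records.items():
        # A field is "labeled" when its record value occurs verbatim in the text.
        labels = {f: v for f, v in fields.items() if v.lower() in text}
        score = sum(idf[v.lower()] for v in labels.values())
        ranked.append((entity_id, score, labels))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical records and document text, for illustration only.
records = {
    "acme": {"name": "Acme Corp", "city": "Nancy", "zip": "54000"},
    "globex": {"name": "Globex SA", "city": "Nancy", "zip": "54000"},
    "initech": {"name": "Initech", "city": "Paris", "zip": "75001"},
}
document = "Invoice from ACME Corp, 12 rue Exemple, 54000 Nancy"
ranking = rank_entities(records, document)
```

In this sketch, values shared by several records (here the city and zip code) receive a lower weight than values unique to one record, so a record that matches on its distinctive name ranks above one that matches only on shared address fields. The graph matching and correction steps described in the abstract are not covered here.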
Main file
ICDAR.pdf (711.13 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01191425, version 1 (01-09-2015)

Identifiers

  • HAL Id: hal-01191425, version 1

Cite

Nihel Kooli, Abdel Belaïd. Semantic Label and Structure Model based Approach for Entity Recognition in Database Context. 13th International Conference on Document Analysis and Recognition (ICDAR 2015), Aug 2015, Nancy, France. pp.5. ⟨hal-01191425⟩
95 views
325 downloads
