Conference paper. Year: 2010

An Information Extraction model for unconstrained handwritten documents

Abstract

This paper introduces a new information extraction system for unconstrained handwritten documents, based on statistical shallow parsing. Unlike the classical approaches found in the literature, such as keyword spotting or full document recognition, our approach relies on a powerful global handwriting model. An entire text line is considered an indivisible entity and is modeled with Hidden Markov Models. In this way, shallow parsing of text lines allows fast extraction of the relevant information in any document while rejecting irrelevant information at the same time. Initial results are promising and demonstrate the value of the approach.
Main file
ICPR10_thomas_chatelain_heutte_paquet.pdf (212.39 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00486107, version 1 (25-05-2010)

Identifiers

  • HAL Id: hal-00486107, version 1

Cite

Simon Thomas, Clement Chatelain, Laurent Heutte, Thierry Paquet. An Information Extraction model for unconstrained handwritten documents. ICPR, Istanbul, Turkey, Aug 2010, France. pp.4. ⟨hal-00486107⟩
34 views
191 downloads
