Iterative Analysis of Pages in Document Collections for Efficient User Interaction - HAL open archive
Conference paper, Year: 2011

Iterative Analysis of Pages in Document Collections for Efficient User Interaction

Abstract

The analysis of sets of degraded documents, such as historical ones, is error-prone and requires human help to reach acceptable quality levels. However, human interaction raises three main issues when processing large numbers of pages: neither the user nor the system should have to wait for work; information provided by a human operator should not be restricted to local, isolated corrections, but should instead produce durable changes in the system; and the ability to interact with a human operator should neither increase the complexity of document models nor duplicate them between the analysis and human interaction processes. To solve these issues, we propose an iterative approach, based on a special mechanism called visual memory, to reintegrate external information during page analysis. To demonstrate the value of this approach for existing systems, we explain how we adapted a rule-based page analysis tool to enable, within this iterative approach, delayed interaction with a human operator, based on an adaptation of error recovery principles for compilers and of the well-known exception handling mechanism. We validated our iterative approach on sales registers from the 18th century.
Main file
20110629-final-RC1-pdfexpress-ok-PID1937303.pdf (342.8 KB)
icdar_talk_interaction-v6-selection-export-hal.pdf (2.23 MB)
Origin: Files produced by the author(s)
Format: Other

Dates and versions

hal-00644927, version 1 (25-11-2011)

Cite

Joseph Chazalon, Bertrand Coüasnon, Aurélie Lemaitre. Iterative Analysis of Pages in Document Collections for Efficient User Interaction. Document Analysis and Recognition (ICDAR), 2011 International Conference on, Sep 2011, China. pp. 503-507, ⟨10.1109/ICDAR.2011.107⟩. ⟨hal-00644927⟩