Conference papers

Iterative Analysis of Pages in Document Collections for Efficient User Interaction

Abstract: The analysis of collections of degraded documents, such as historical ones, is error-prone and requires human help to reach acceptable quality levels. However, human interaction raises three main issues when processing large numbers of pages: neither the user nor the system should have to wait for work; information provided by a human operator should not be restricted to local, isolated corrections, but should instead produce durable changes in the system; and the ability to interact with a human operator should neither increase the complexity of document models nor duplicate them between the analysis and human interaction processes. To solve these issues, we propose an iterative approach, based on a special mechanism called visual memory, to reintegrate external information during page analysis. To demonstrate its value for existing systems, we explain how we adapted a rule-based page analysis tool to enable, within this iterative approach, delayed interaction with a human operator, based on an adaptation of error recovery principles for compilers and the well-known exception handling mechanism. We validated our iterative approach on sales registers from the 18th century.

https://hal.archives-ouvertes.fr/hal-00644927
Contributor: Joseph Chazalon
Submitted on: Friday, November 25, 2011 - 3:21:00 PM
Last modification on: Friday, March 6, 2020 - 4:32:02 PM
Document(s) archived on: Sunday, February 26, 2012 - 2:31:38 AM

Citation

Joseph Chazalon, Bertrand Coüasnon, Aurélie Lemaitre. Iterative Analysis of Pages in Document Collections for Efficient User Interaction. Document Analysis and Recognition (ICDAR), 2011 International Conference on, Sep 2011, China. pp. 503-507, ⟨10.1109/ICDAR.2011.107⟩. ⟨hal-00644927⟩
