Conference papers

Iterative Analysis of Pages in Document Collections for Efficient User Interaction

Abstract : The analysis of sets of degraded documents, such as historical ones, is error-prone and requires human help to reach acceptable quality levels. However, human interaction raises three main issues when processing large numbers of pages: neither the user nor the system should have to wait for work; information provided by a human operator should not be restricted to local, isolated corrections, but should instead produce durable changes in the system; and the ability to interact with a human operator should neither increase the complexity of document models nor duplicate them between the analysis and human interaction processes. To solve these issues, we propose an iterative approach, based on a special mechanism called visual memory, to reintegrate external information during page analysis. To demonstrate its relevance to existing systems, we explain how we adapted a rule-based page analysis tool to enable, within this iterative approach, delayed interaction with a human operator, based on an adaptation of error recovery principles from compilers and the well-known exception handling mechanism. We validated our iterative approach on sales registers from the 18th century.
Contributor : Joseph Chazalon
Submitted on : Friday, November 25, 2011 - 3:21:00 PM
Last modification on : Thursday, January 7, 2021 - 4:35:32 PM
Long-term archiving on : Sunday, February 26, 2012 - 2:31:38 AM



Joseph Chazalon, Bertrand Coüasnon, Aurélie Lemaitre. Iterative Analysis of Pages in Document Collections for Efficient User Interaction. Document Analysis and Recognition (ICDAR), 2011 International Conference on, Sep 2011, China. pp.503-507, ⟨10.1109/ICDAR.2011.107⟩. ⟨hal-00644927⟩


