Visual Graph Analysis for Quality Assessment of Manually Labelled Documents Image Database

Abstract : The context of this paper is the labelling of a document image database in an industrial process. Our work focuses on assessing the quality of a given labelled database. In most practical cases, a database is labelled manually by an operator who browses the images sequentially (presented as thumbnails) until the whole database is labelled. This task is very repetitive; moreover, the filing plan defining the names and number of classes is often incomplete, which leads to many labelling errors. The question is then whether the quality of a labelled batch is good enough to accept it as a whole. Our objective is to ease and speed up this evaluation, which can take up to 1.5 times longer than the labelling work itself. We propose an interactive tool that visualizes the data as a graph. The graph highlights both the similarities between documents and the labelling quality. We define criteria on the graph that characterize the three types of errors an operator can make: an image is mislabelled, one class should be split into more pertinent subclasses, or several classes should be merged into another. This allows us to focus the operator's attention on potential errors. The operator can then count the errors encountered while auditing the database and decide whether to validate the global labelling quality.

Contributor : Romain Giot
Submitted on : Tuesday, June 30, 2015 - 4:58:22 PM
Last modification on : Thursday, January 11, 2018 - 6:20:17 AM
Long-term archiving on : Tuesday, April 25, 2017 - 8:36:35 PM




  • HAL Id : hal-01170011, version 1


Romain Giot, Romain Bourqui, Nicholas Journet, Anne Vialard. Visual Graph Analysis for Quality Assessment of Manually Labelled Documents Image Database. 13th International Conference on Document Analysis and Recognition (ICDAR 2015), Aug 2015, Tunis, Tunisia. pp.7. ⟨hal-01170011⟩


