Document Images Indexing with Relevance Feedback : an Application to Industrial Context

Abstract: This article presents a new method for indexing document images. The work is set in an industrial context where thousands of document images are digitized daily and must be sorted into classes such as payroll documents, various bills, and information letters. We propose a software method that aims to accelerate this task. Usually, the number of document classes is unknown a priori. In this paper, we propose an automatic estimation of the number of classes. Using this estimate, a clustering algorithm groups the document images. After this step, we propose an assisted classification tool based on content-based image retrieval (CBIR). For each cluster, a reference image is automatically selected; the other images are then sorted according to a similarity measure and shown to the user. By interacting with the process, the user can reject wrong images. This feedback is automatically taken into account to enhance the similarity measure by selecting features. The first tests show that, on average, databases are indexed 3 times faster with our assisted classification method than with a standard manual classification process.
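The ranking and feedback loop described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not specify the features, the similarity measure, or the feature-selection rule. Here each image is a hypothetical two-value feature vector, similarity is a Euclidean distance restricted to the selected features, and feedback keeps the features on which rejected images differ most from the reference compared with accepted ones. The reference image is taken as given; how it is selected is not shown.

```python
from math import sqrt

def distance(a, b, mask):
    """Euclidean distance using only the features enabled in mask."""
    return sqrt(sum((x - y) ** 2 for x, y, keep in zip(a, b, mask) if keep))

def rank(features, ref_id, mask):
    """Sort all images except the reference by increasing distance to it."""
    ref = features[ref_id]
    others = [i for i in features if i != ref_id]
    return sorted(others, key=lambda i: distance(features[i], ref, mask))

def select_features(features, ref_id, accepted, rejected, keep=1):
    """Hypothetical selection rule: keep the `keep` features whose gap to the
    reference is large for rejected images and small for accepted ones."""
    ref = features[ref_id]
    n = len(ref)

    def mean_gap(ids, k):
        return sum(abs(features[i][k] - ref[k]) for i in ids) / len(ids)

    scores = [mean_gap(rejected, k) - mean_gap(accepted, k) for k in range(n)]
    best = sorted(range(n), key=lambda k: scores[k], reverse=True)[:keep]
    return [k in best for k in range(n)]

# Toy cluster (made-up values): two bills and one letter grouped with a reference bill.
features = {
    "ref":    [0.0, 0.0],
    "bill_a": [0.1, 0.9],   # same class as ref, but feature 1 is noisy
    "bill_b": [0.2, 0.8],
    "letter": [0.7, 0.1],   # wrong class
}

# First pass: all features are used, so the letter is ranked first.
before = rank(features, "ref", [True, True])

# The user rejects the letter; the remaining images count as accepted.
mask = select_features(features, "ref",
                       accepted=["bill_a", "bill_b"], rejected=["letter"])

# Second pass: only the selected feature is used, so the letter moves last.
after = rank(features, "ref", mask)

print(before)  # ['letter', 'bill_b', 'bill_a']
print(mask)    # [True, False]
print(after)   # ['bill_a', 'bill_b', 'letter']
```

In this toy case a single rejection is enough to disable the noisy feature and push the wrong image to the end of the list, which is the kind of effect the abstract attributes to user feedback.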
Document type:
Conference paper
ICDAR, Sep 2011, Beijing, China. pp.1190-1194, 2011
Contributor: Olivier Augereau
Submitted on: Friday, 9 December 2011 - 10:12:43
Last modified on: Thursday, 11 January 2018 - 06:20:16
Document(s) archived on: Saturday, 10 March 2012 - 02:21:05




  • HAL Id : hal-00649870, version 1



Olivier Augereau, Nicholas Journet, Domenger Jean Philippe. Document Images Indexing with Relevance Feedback : an Application to Industrial Context. ICDAR, Sep 2011, Beijing, China. pp.1190-1194, 2011. 〈hal-00649870〉


