Multi-view hac for Semi-supervised Document Image Classification

Abstract: This paper presents a semi-supervised document image classification system designed to be integrated into commercial document reading software. The system acts as an annotation aid. Given a set of unknown document images supplied by a human operator, it computes hypotheses for grouping images that share the same physical layout and proposes them to the operator. The operator can then correct or validate these groups, with the objective of obtaining homogeneous groups of images. These groups are then used to train the supervised document image classifier. The system contains N feature spaces, each with its own metric function for computing the similarity between two points of that space. After projecting each image into these N feature spaces, the system builds N hierarchical agglomerative classification (hac) trees, one per feature space. The grouping proposals produced by the various hac trees are compared and merged. Results, evaluated by the number of corrections made by the operator, are presented on different image sets.
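The pipeline outlined in the abstract (one hac per feature space, then a merge of the resulting proposals) can be illustrated with a minimal sketch. The abstract does not specify the linkage criterion, the cut threshold, or the merging rule, so everything below is an assumption made for illustration: single linkage, a fixed distance threshold, and a strict consensus merge in which two images stay together only if every view groups them. The names `hac_groups`, `merge_views`, `view_a`, and `view_b` are hypothetical, and the one-dimensional features are toy values.

```python
from itertools import combinations


def hac_groups(points, metric, threshold):
    """Agglomerative clustering of one feature space (one "view").

    Assumption: single linkage, and merging stops once the closest
    pair of clusters is farther apart than `threshold`.
    """
    clusters = [{i} for i in range(len(points))]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Single linkage: distance between the closest members.
            d = min(metric(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break
        clusters[a] |= clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


def merge_views(partitions, n_items):
    """Assumed merging rule: keep two items together only if all views agree."""
    labels = []
    for partition in partitions:
        label = [None] * n_items
        for cluster_id, cluster in enumerate(partition):
            for item in cluster:
                label[item] = cluster_id
        labels.append(label)
    groups = {}
    for item in range(n_items):
        key = tuple(label[item] for label in labels)
        groups.setdefault(key, []).append(item)
    return sorted(groups.values())


def distance(a, b):
    """Toy metric: absolute difference of scalar features."""
    return abs(a - b)


# Four images projected into two toy one-dimensional feature spaces.
view_a = [0.0, 0.1, 5.0, 5.1]
view_b = [0.0, 0.2, 0.1, 9.0]

partitions = [hac_groups(view, distance, 0.5) for view in (view_a, view_b)]
proposals = merge_views(partitions, 4)
print(proposals)  # [[0, 1], [2], [3]]
```

With this strict rule, disagreement between views yields smaller groups, so the operator would tend to merge groups by hand; a looser rule would give larger groups that may need splitting. Which trade-off the actual system makes is not stated in the abstract.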
Document type:
Conference paper
Simone Marinai and Andreas R. Dengel. 6th International Workshop on Document Analysis Systems, 2004, Florence, Italy. Springer Verlag, 3163, pp.191--200, 2004, Lecture Notes in Computer Science. 〈10.1007/978-3-540-28640-0_18〉

https://hal.archives-ouvertes.fr/hal-00637062
Contributor: Pierre Héroux
Submitted on: Saturday, 29 October 2011 - 17:09:23
Last modified on: Wednesday, 11 October 2017 - 11:18:03
Document(s) archived on: Monday, 30 January 2012 - 11:19:17

File

das2004.pdf
Publisher files allowed on an open archive

Citation

Fabien Carmagnac, Pierre Héroux, Eric Trupin. Multi-view hac for Semi-supervised Document Image Classification. Simone Marinai and Andreas R. Dengel. 6th International Workshop on Document Analysis Systems, 2004, Florence, Italy. Springer Verlag, 3163, pp.191--200, 2004, Lecture Notes in Computer Science. 〈10.1007/978-3-540-28640-0_18〉. 〈hal-00637062〉
