Using Bags of Symbols for Automatic Indexing of Graphical Document Image Databases
Abstract
A database is only useful if it comes with a set of procedures for retrieving the elements relevant to users' needs. Many information retrieval (IR) techniques have been developed for automatic indexing and retrieval in document databases. Most of them build indexes from the textual content of documents, and very few can handle graphical or image content without human annotation. This paper describes an approach, similar to the bag-of-words technique, for automatically indexing graphical document image databases, together with several ways of querying the resulting indexes. The approach is unsupervised: it proposes a set of automatically discovered symbols that can be combined with logical operators to build queries.
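To make the idea of combining discovered symbols with logical operators concrete, here is a minimal sketch in Python. It assumes symbol discovery has already happened and that each document is reduced to a bag of symbol labels; the labels (`s3`, `s7`, `s12`), the document ids, and the helper names are hypothetical and do not come from the paper, which may index and query quite differently.

```python
from collections import defaultdict


def build_index(documents):
    """Build an inverted index: symbol label -> set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, symbols in documents.items():
        for symbol in symbols:
            index[symbol].add(doc_id)
    return index


def query_and(index, *symbols):
    """Documents that contain every one of the given symbols."""
    sets = [index.get(s, set()) for s in symbols]
    return set.intersection(*sets) if sets else set()


def query_or(index, *symbols):
    """Documents that contain at least one of the given symbols."""
    return set().union(*(index.get(s, set()) for s in symbols))


def query_not(index, all_docs, symbol):
    """Documents that do not contain the given symbol."""
    return set(all_docs) - index.get(symbol, set())


# Hypothetical bags of symbols, one per graphical document (illustrative data only).
documents = {
    "plan_01": ["s3", "s7", "s7", "s12"],
    "plan_02": ["s3", "s12"],
    "plan_03": ["s7"],
}
index = build_index(documents)

# "s3 AND NOT s7": intersect one result set with the complement of another.
result = query_and(index, "s3") & query_not(index, documents, "s7")
```

Because every query returns a plain set of document ids, the operators compose freely with Python's own set operations, which is one simple way to support arbitrary Boolean combinations.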
Domains
Document and Text Processing
Origin: Files produced by the author(s)