Automatic annotation extension and classification of documents using a probabilistic graphical model
Abstract
With the rapid growth in the number of document images, document annotation has become a
research area of great interest. Annotations describe the semantic content of documents and make
them easier for users to use and search. However, for a very large number of documents, annotating
each document manually is tedious. One solution is to annotate a small subset of the documents and to extend
these annotations automatically to the whole dataset. In this paper, we propose a model for annotation extension
and for document classification using a probabilistic graphical model. The model combines visual and
textual features, and we show that integrating user feedback significantly improves the results.