Visual Saliency and Terminology Extraction for Document Classification

Benjamin Duthil; Mickaël Coustaty; Vincent Courboulay; Jean-Marc Ogier

doi:10.1007/978-3-662-44854-0_8

Conference paper, Year: 2014

Benjamin Duthil

Role: Author

Laboratoire Informatique, Image et Interaction - EA 2118

Mickaël Coustaty

Role: Author
PersonId : 2462
IdHAL : mickael-coustaty
ORCID : 0000-0002-0123-439X
IdRef : 160560268

Laboratoire Informatique, Image et Interaction - EA 2118

Vincent Courboulay

Role: Author
PersonId : 7043
IdHAL : vincent-courboulay
IdRef : 073983616

Laboratoire Informatique, Image et Interaction - EA 2118

Jean-Marc Ogier

Role: Author
PersonId : 860163

Laboratoire Informatique, Image et Interaction - EA 2118

Abstract

Document digitization has become a crucial economic issue in our society, and it is now necessary to be able to organize the resulting huge volume of documents. This paper proposes a new method to automatically classify documents using a saliency-based segmentation process on the one hand, and terminology extraction and annotation on the other. The saliency-based segmentation is used to extract salient regions, and thereby logos, while the terminology approach is used to annotate them and to automatically classify the document. The approach does not require human expertise and uses Google Images as a knowledge base. The results obtained on a real database of 1766 documents show the relevance of the approach.

Domains

Computer Science [cs]

https://hal.science/hal-01247935

Submitted on: Wednesday, December 23, 2015, 10:35:55

Last modified on: Thursday, May 12, 2022, 15:36:20

Dates and versions

hal-01247935 , version 1 (23-12-2015)

Identifiers

HAL Id : hal-01247935 , version 1
DOI : 10.1007/978-3-662-44854-0_8

Cite

Benjamin Duthil, Mickaël Coustaty, Vincent Courboulay, Jean-Marc Ogier. Visual Saliency and Terminology Extraction for Document Classification. Graphic Recognition, Bart Lamiroy, Aug 2013, Bethlehem, PA, USA, United States. pp.96-108, ⟨10.1007/978-3-662-44854-0_8⟩. ⟨hal-01247935⟩

Collections

L3I UNIV-ROCHELLE
