Context and Keyword Extraction in Plain Text Using a Graph Representation

Document indexing is an essential task performed by archivists or by automatic indexing tools. To retrieve documents relevant to a query, the keywords describing each document have to be chosen carefully. Archivists must identify the topic of a document before they start extracting keywords. For an archivist indexing specialized documents, experience plays an important role, but indexing documents on different topics is much harder. This article proposes an innovative method for an indexing support system. The system takes as input an ontology and a plain text document and outputs contextualized keywords for the document. The method has been evaluated using Wikipedia's category links as a termino-ontological resource.
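The input/output contract described above (an ontology and a plain text in, a context and its keywords out) can be illustrated with a minimal sketch. This is not the authors' algorithm: the scoring rule (pick the category reached by the most matched terms), the toy `PARENTS` graph, and the names `ancestors`, `extract_context_and_keywords`, and `max_depth` are all assumptions made for illustration, with the graph only loosely imitating Wikipedia category links.

```python
from collections import Counter, deque
import re

# Hypothetical toy ontology: each node maps to its parent categories,
# loosely imitating Wikipedia category links. Not data from the paper.
PARENTS = {
    "python": ["programming languages", "snakes"],
    "java": ["programming languages", "islands"],
    "compiler": ["programming tools"],
    "programming languages": ["computing"],
    "programming tools": ["computing"],
    "snakes": ["reptiles"],
    "islands": ["geography"],
}


def ancestors(term, parents, max_depth=2):
    """Return every category reachable from `term` within `max_depth` links."""
    seen = set()
    queue = deque([(term, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not climb further than max_depth links
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, depth + 1))
    return seen


def extract_context_and_keywords(text, parents, max_depth=2):
    """Assumed heuristic: the context is the category shared by the most
    matched terms; the keywords are the terms that reach that category."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    matched = sorted(t for t in tokens if t in parents)
    reach = {t: ancestors(t, parents, max_depth) for t in matched}
    votes = Counter(cat for cats in reach.values() for cat in cats)
    if not votes:
        return None, []
    # Ties are broken alphabetically so the result is deterministic.
    context = min(votes, key=lambda c: (-votes[c], c))
    keywords = [t for t in matched if context in reach[t]]
    return context, keywords
```

On this toy graph, the ambiguous terms "python" and "java" are resolved toward "computing" when they appear next to "compiler", because that category is the only one all three terms reach. A real termino-ontological resource would need multi-word term matching and a more careful scoring scheme than this vote count.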

Keywords

Graph, Knowledge representation, Semantic Web

Domains

Information Retrieval [cs.IR]

Main file

latex8.pdf (345.08 KB)

Origin: Files produced by the author(s)

Contributor: Carlo Abi Chahine

https://hal.science/hal-00439215

Submitted on: Monday, December 7, 2009, 17:42:23

Last modified on: Friday, December 22, 2023, 15:16:05

Long-term archiving on: Thursday, June 17, 2010, 23:20:01

Dates and versions

hal-00439215, version 1 (07-12-2009)

Identifiers

HAL Id: hal-00439215, version 1
arXiv: 0912.1421

Cite

Carlo Abi Chahine, Nathalie Chaignaud, Jean-Philippe Kotowicz, Jean-Pierre Pécuchet. Context and Keyword Extraction in Plain Text Using a Graph Representation. IEEE International Conference on Signal Image Technology and Internet Based Systems, SITIS '08, Nov 2008, Bali, Indonesia. pp. 692-696. ⟨hal-00439215⟩


Collections

INSA-ROUEN LITIS COMUE-NORMANDIE UNIROUEN UNILEHAVRE INSA-GROUPE
