Journal article. Journal of Documentation. Year: 2006

Constructing and maintaining knowledge organization tools: a symbolic approach.

Abstract

Purpose - To propose a comprehensive, semi-automatic method for constructing or updating knowledge organization tools such as thesauri.

Design/methodology/approach - The paper proposes a comprehensive methodology for thesaurus construction and maintenance that combines shallow NLP with a clustering algorithm and an information visualization interface. The resulting system, TermWatch, extracts terms from a text collection, mines semantic relations between them using complementary linguistic approaches, and clusters the terms using these semantic relations. The clusters are mapped onto a 2D map using an integrated visualization tool.

Findings - The clusters formed exhibit the different relations necessary to populate a thesaurus or ontology: synonymy, generic/specific and relatedness. The clusters represent, for a given term, its closest neighbours in terms of semantic relations.

Practical implications - This could change the way in which information professionals (librarians and documentalists) undertake knowledge organization tasks. TermWatch can be useful either as a starting point for grasping the conceptual organization of knowledge in a huge text collection without having to read the texts, or as a suggestive tool for populating the different hierarchies of a thesaurus or an ontology, because its clusters are based on semantic relations.

Originality/value - The originality lies in several points. The method combines linguistic relations with an adapted clustering algorithm that is scalable and can handle sparse data. The paper proposes a comprehensive approach to semantic relation acquisition, whereas existing studies often use only one or two approaches. The domain knowledge maps produced by the system represent an added advantage over existing approaches to automatic thesaurus construction in that clusters are formed using semantic relations between domain terms. Thus, while offering a meaningful synthesis of the information contained in the original corpus through clustering, the results can be used for knowledge organization tasks (thesaurus building and ontology population). The system also constitutes a platform for performing several knowledge-oriented tasks such as science and technology watch, text mining and query refinement.
Main file
JDOC-final.pdf (406.46 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00636127 , version 1 (26-10-2011)

Cite

Fidelia Ibekwe-Sanjuan. Constructing and maintaining knowledge organization tools: a symbolic approach. Journal of Documentation, 2006, 62 (2), pp.229-250. ⟨10.1108/00220410610653316⟩. ⟨hal-00636127⟩