Metadata Propagation in the Web Using Co-citations
Conference paper, Year: 2005

Abstract

Given the great heterogeneity of the World Wide Web, using metadata on the search engine side appears to be a promising approach to information retrieval. However, because manual qualification at Web scale is not feasible, this approach has rarely been pursued. We propose a semi-automatic method for propagating metadata. In a first step, homogeneous corpora are extracted. Our study used the following properties: the authority type, the site type, the information type, and the page type. This first step is carried out by a clustering that uses a similarity measure based on the co-citation frequency between pages. Given the cluster hierarchy, the second step selects a small number of documents to be qualified manually and propagates the resulting metadata values to the other documents belonging to the same cluster. A qualitative evaluation and a preliminary study of the scalability of this method are presented.
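
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact similarity formula or the clustering algorithm, so the sketch assumes a Jaccard-style normalization of co-citation counts and replaces the cluster hierarchy with a single flat cut (pages are grouped when their similarity reaches a threshold). The function names, the toy link graph, and the label values are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cited_by(citations):
    """Invert a link graph: map each cited page to the set of pages citing it."""
    cited_by = defaultdict(set)
    for source, targets in citations.items():
        for target in targets:
            cited_by[target].add(source)
    return dict(cited_by)


def similarity(cited_by, a, b):
    """Assumed measure: co-citation count normalized Jaccard-style."""
    both = cited_by[a] & cited_by[b]      # pages citing a and b together
    either = cited_by[a] | cited_by[b]    # pages citing at least one of them
    return len(both) / len(either) if either else 0.0


def cluster(cited_by, threshold):
    """Flat single-link clustering via union-find (stand-in for a hierarchy cut)."""
    pages = sorted(cited_by)
    parent = {p: p for p in pages}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for a, b in combinations(pages, 2):
        if similarity(cited_by, a, b) >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for p in pages:
        groups[find(p)].append(p)
    return sorted(groups.values())


def propagate(clusters, manual_labels):
    """Copy a manually assigned value to every unlabeled page of the same cluster."""
    labels = {}
    for members in clusters:
        seeds = [manual_labels[p] for p in members if p in manual_labels]
        if not seeds:
            continue  # no qualified page in this cluster: nothing to propagate
        for p in members:
            # Manual values are kept; other pages take the first seed value.
            labels[p] = manual_labels.get(p, seeds[0])
    return labels


# Toy link graph: each citing page maps to the pages it links to.
citations = {
    "hub1": {"a", "b"},
    "hub2": {"a", "b"},
    "hub3": {"c", "d"},
    "hub4": {"b", "c", "d"},
}
cited_by = build_cited_by(citations)
clusters = cluster(cited_by, threshold=0.5)

# Hypothetical "site type" values assigned by hand to one page per cluster.
site_type = propagate(clusters, {"a": "institutional", "c": "commercial"})
```

In this toy graph, pages `a` and `b` end up in one cluster and `c` and `d` in another, so qualifying two pages by hand is enough to label all four. When a cluster contains conflicting manual values, the sketch simply takes the first one; a real system would need an explicit policy for that case.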
Full text not deposited

Dates and versions

hal-00406859 , version 1 (23-07-2009)

Citation

Camille Prime Claverie, Michel Beigbeder, Thierry Lafouge. Metadata Propagation in the Web Using Co-citations. 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05), Sep 2005, Compiègne, France. pp. 602-605, ISBN 0-7695-2415-X, ⟨10.1109/WI.2005.95⟩. ⟨hal-00406859⟩