Tag Similarity in Folksonomies

Abstract : Folksonomies, collections of user-contributed tags, have proved effective in reducing the inherent semantic gap when retrieving web content. To make the best use of folksonomies, tag clustering was proposed to address the problems caused by free-style user tagging, such as lexical variations, tag splitting, and multilingualism. In this paper, we propose a novel approach for identifying similar tags in folksonomies. It is based on the idea that the most frequent tags in a folksonomy can be used to identify groups of semantically related tags. First, the frequent tags are identified and their co-occurrence statistics are used to create a probability distribution for each tag. The frequent tags are then clustered based on the distance between their co-occurrence probability distributions. Next, probability distributions for the less frequent tags are generated from their co-occurrence with the clusters of frequent tags. Finally, similar tags are identified by calculating the distance between the corresponding probability distributions. To that end, we propose an extension of the Jensen-Shannon divergence that is sensitive to the size of the sample from which the co-occurrence probability distributions are calculated. We evaluated our approach by applying it to folksonomies obtained from Flickr. Additionally, we compared our results with those produced by a traditional tag clustering method, which identifies similar tags by calculating the cosine similarity between the co-occurrence vectors of the tags. The evaluation shows promising results and highlights the advantage of our approach.
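
The core idea of comparing tags through their co-occurrence distributions can be illustrated with a minimal sketch. It rests on several assumptions: the function names and the toy data are hypothetical, the intermediate clustering of frequent tags is skipped, and the standard Jensen-Shannon divergence is used in place of the sample-size-sensitive extension proposed in the paper, which the abstract does not specify.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations


def cooccurrence_distributions(resources, frequent_tags):
    """Map each tag to a probability distribution over the frequent tags
    it co-occurs with (hypothetical helper, not the paper's exact procedure)."""
    counts = defaultdict(Counter)
    for tags in resources:
        # Count every ordered pair of distinct tags on the same resource.
        for tag, other in permutations(set(tags), 2):
            if other in frequent_tags:
                counts[tag][other] += 1
    dists = {}
    for tag, counter in counts.items():
        total = sum(counter.values())
        dists[tag] = {t: n / total for t, n in counter.items()}
    return dists


def jensen_shannon_divergence(p, q):
    """Standard Jensen-Shannon divergence with base-2 logarithm (range 0 to 1).
    This is NOT the extended, sample-size-sensitive variant from the paper."""
    keys = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mix(dist):
        return sum(v * math.log2(v / mix[k]) for k, v in dist.items() if v > 0)

    return 0.5 * kl_to_mix(p) + 0.5 * kl_to_mix(q)


# Toy folksonomy: each inner list holds the tags of one resource (made-up data).
resources = [
    ["paris", "eiffel", "france"],
    ["paris", "eiffeltower", "france"],
    ["paris", "louvre", "france"],
    ["beach", "sea", "sun"],
    ["beach", "sea", "sand"],
]
frequent = {"paris", "france", "beach", "sea"}

dists = cooccurrence_distributions(resources, frequent)
same_topic = jensen_shannon_divergence(dists["eiffel"], dists["eiffeltower"])
different_topic = jensen_shannon_divergence(dists["eiffel"], dists["sun"])
print(same_topic, different_topic)
```

In this toy data, the lexical variants "eiffel" and "eiffeltower" never appear together, yet they co-occur with the same frequent tags, so their divergence is low; "eiffel" and "sun" share no frequent neighbours, so their divergence is at its maximum.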
Document type : Conference papers

https://hal.archives-ouvertes.fr/hal-01339168
Contributor : Équipe Gestionnaire Des Publications Si Liris
Submitted on : Wednesday, June 29, 2016 - 3:47:30 PM
Last modification on : Friday, January 11, 2019 - 5:09:21 PM

Identifiers

  • HAL Id : hal-01339168, version 1

Citation

Hatem Mousselly-Sergieh, Elod Egyed-Zsigmond, Gabriele Gianini, Mario Döller, Harald Kosch, et al. Tag Similarity in Folksonomies. INFORSID 2013, May 2013, Paris, France. pp.319-334. ⟨hal-01339168⟩
