Conference paper, Year: 2019

Content similarity analysis of written comments under posts in social media

Abstract

Written comments on social media posts are an important measure of followers' feedback on the content of those posts. However, the large number of unrelated comments following each post can affect many aspects of user engagement as well as the visibility of the post itself. Comments related to a post's topic usually give readers more insight into the post's content and can attract their attention. Unrelated comments, on the other hand, distract readers from the original topic, disturb them with worthless content, and can mislead them or influence their opinion. In this paper, we propose an effective framework to measure the content similarity of given comments to a post and to distinguish written comments that are related to the post from those that are not. To that end, the proposed framework introduces a novel feature engineering step that combines syntactical, topical, and semantic feature sets and leverages a word embedding approach. A machine learning-based classification approach is used to label the comments of each post as related or unrelated. The proposed framework is evaluated on a dataset of 33,921 comments written under 30 posts from the BBC News page on Facebook. The evaluation indicates that our model achieves an average precision of 86% in identifying related and unrelated comments, with an improvement of 9.6% in accuracy over previous work, without relying on the full article behind each post or on the content of external web pages related to each post. As a case study, the learned classifier is applied to a larger dataset of 278,370 comments written under 332 posts, and we observe that almost 60% of the written comments are not related to the content of the actual posts.
Investigating the content of both groups of comments, related and unrelated, with respect to the topics of their posts shows that most related comments are objective and discuss the posts' content in terms of topics, whereas unrelated comments usually contain subjective and very general words that express feedback without any focus on the subject of the posts.
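
The abstract does not specify the individual features, so the following is only a minimal sketch of the general idea of scoring a comment against its post. It assumes two simple lexical measures, Jaccard overlap and term-frequency cosine similarity; the function names and example texts are hypothetical and are not taken from the paper. The topical and word-embedding features mentioned in the abstract would need a topic model and pretrained embeddings, which are left out here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard_similarity(tokens_a, tokens_b):
    """Share of distinct words common to both texts (0.0 if both are empty)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between raw term-frequency vectors."""
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(freq_a[w] * freq_b[w] for w in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_features(post, comment):
    """Return [jaccard, cosine] for one (post, comment) pair."""
    post_tokens, comment_tokens = tokenize(post), tokenize(comment)
    return [
        jaccard_similarity(post_tokens, comment_tokens),
        cosine_similarity(post_tokens, comment_tokens),
    ]


# Made-up example texts, not taken from the paper's dataset.
post = "Government announces new climate policy to cut carbon emissions"
on_topic = "Cutting carbon emissions needs a stronger climate policy"
off_topic = "Lovely photo, have a nice day everyone"

print(similarity_features(post, on_topic))
print(similarity_features(post, off_topic))
```

In a pipeline of the kind the abstract describes, feature vectors like these would be computed for every (post, comment) pair and passed to a trained classifier that labels each comment as related or unrelated. Purely lexical overlap misses paraphrases, which is presumably one reason to add semantic features such as word embeddings.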
Main file
RC_SNAMS2019_37.pdf (834.96 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02363538, version 1 (14-11-2019)

Identifiers

HAL Id: hal-02363538
DOI: 10.1109/SNAMS.2019.8931726

Cite

Marzieh Mozafari, Reza Farahbakhsh, Noel Crespi. Content similarity analysis of written comments under posts in social media. SNAMS 2019: 6th International Conference on Social Networks Analysis, Management and Security, Oct 2019, Granada, Spain. pp.158-165, ⟨10.1109/SNAMS.2019.8931726⟩. ⟨hal-02363538⟩