Conference paper, Year: 2010

An Approach to Model and Predict the Popularity of Online Contents with Explanatory Factors

Abstract

In this paper, we propose a methodology to predict the popularity of online content. More precisely, rather than trying to infer the popularity of a content item itself, we infer the likelihood that it will be popular. Our approach is rooted in survival analysis, where predicting the precise lifetime of an individual is nearly impossible, but predicting the likelihood that an individual survives longer than a given threshold, or longer than another individual, is feasible. We take the standpoint of an external observer who has to infer the popularity of a content item using only publicly observable metrics, such as the lifetime of a thread, the number of comments, and the number of views. Our goal is to infer these observable metrics using a set of explanatory factors, such as the number of comments and the number of links in the first hours after publication, which are also observable by the external observer. We use a Cox proportional hazards regression model that divides the distribution function of the observable popularity metric into two components: (a) one that can be explained by the given set of explanatory factors (called risk factors), and (b) a baseline distribution function that integrates all the factors not taken into account. To validate the approach, we use data sets from two online discussion forums: dpreview.com, one of the largest online discussion groups, providing news and discussion forums about all kinds of digital cameras, and myspace.com, a representative online social networking service. On these two data sets we model two popularity metrics, the lifetime of threads and the number of comments, and show that our approach can predict the lifetime of Dpreview threads by observing a thread during its first 5 to 6 days (24 hours for Myspace), and the number of comments of Dpreview threads by observing a thread during its first 2 to 3 days.
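The two-component split described in the abstract matches the standard form of a Cox proportional hazards model, h(t | x) = h0(t) * exp(beta . x), where h0 is the baseline and exp(beta . x) is the part explained by the risk factors. The following is a minimal sketch of the two kinds of statement the abstract says are feasible (survival past a threshold, and one item outlasting another), not the authors' implementation. The coefficient values, the two example threads, and the function names are hypothetical; in practice the coefficients would come from fitting the regression to forum data.

```python
import math


def relative_risk(beta, x):
    """exp(beta . x): the hazard multiplier explained by the risk factors."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))


def survival_probability(baseline_survival, beta, x):
    """S(t | x) = S0(t) ** exp(beta . x) under proportional hazards.

    baseline_survival is S0(t), the baseline probability of lasting past t.
    """
    return baseline_survival ** relative_risk(beta, x)


def prob_outlasts(beta, x_a, x_b):
    """P(T_a > T_b) for two independent items with continuous lifetimes.

    Under proportional hazards this is r_b / (r_a + r_b), so the baseline
    cancels and only the risk factors matter.
    """
    r_a = relative_risk(beta, x_a)
    r_b = relative_risk(beta, x_b)
    return r_b / (r_a + r_b)


# Hypothetical coefficients (NOT taken from the paper): each early comment or
# link lowers the hazard of the thread ending.
beta = [-0.05, -0.10]

# Hypothetical risk factors: [comments, links] seen in the first hours.
quiet_thread = [2, 0]
busy_thread = [40, 5]

# Assumed baseline: half of all threads last past the chosen threshold.
s0 = 0.5

print("quiet thread survives threshold:", survival_probability(s0, beta, quiet_thread))
print("busy thread survives threshold: ", survival_probability(s0, beta, busy_thread))
print("busy outlasts quiet:            ", prob_outlasts(beta, busy_thread, quiet_thread))
```

Because the pairwise comparison does not depend on the baseline, it can be evaluated from the fitted coefficients alone, whereas the threshold probability also needs an estimate of the baseline survival function.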
Main file
wi2010_lee.pdf (2.78 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-00527135, version 1 (18-10-2010)

Identifiers

Cite

Jong Gun Lee, Sue Moon, Kavé Salamatian. An Approach to Model and Predict the Popularity of Online Contents with Explanatory Factors. WI-IAT 2010 - IEEE / WIC / ACM International Conferences on Web Intelligence and Intelligent Agent Technology, Aug 2010, Toronto, Canada. pp.623-630, ⟨10.1109/WI-IAT.2010.209⟩. ⟨hal-00527135⟩
260 Views
614 Downloads
