An Approach to Model and Predict the Popularity of Online Contents with Explanatory Factors

Abstract : In this paper, we propose a methodology to predict the popularity of online content. More precisely, rather than trying to infer the popularity of a piece of content itself, we infer the likelihood that it will be popular. Our approach is rooted in survival analysis, where predicting the precise lifetime of an individual is nearly impossible, but predicting the likelihood that an individual survives longer than a given threshold, or longer than another individual, is feasible. We take the standpoint of an external observer who must infer the popularity of a piece of content using only publicly observable metrics, such as the lifetime of a thread, the number of comments, and the number of views. Our goal is to infer these popularity metrics from a set of explanatory factors that the external observer can also see, such as the number of comments and the number of links in the first hours after publication. We use a Cox proportional hazards regression model that divides the distribution function of the observable popularity metric into two components: a) one that can be explained by the given set of explanatory factors (called risk factors) and b) a baseline distribution function that integrates all the factors not taken into account. To validate our approach, we use data sets from two online discussion forums: dpreview.com, one of the largest online discussion groups providing news and discussion forums about all kinds of digital cameras, and myspace.com, a representative online social networking service. On these two data sets we model two popularity metrics, the lifetime of threads and the number of comments, and show that our approach can predict the lifetime of threads from Dpreview and Myspace by observing a thread during its first 5 to 6 days and its first 24 hours, respectively, and the number of comments of Dpreview threads by observing a thread during its first 2 to 3 days.
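The two-component split described in the abstract can be illustrated with the standard Cox proportional hazards form, in which the probability that a thread outlives a threshold t is S(t | x) = S0(t) ^ exp(beta . x), where S0 is the baseline survival function and x holds the risk factors. The sketch below is only an illustration under stated assumptions: the coefficient values, the baseline probability, and the function names are hypothetical and are not taken from the paper, which estimates these quantities from the Dpreview and Myspace data.

```python
import math


def risk_score(beta, x):
    """Relative hazard exp(beta . x) for one thread's explanatory factors."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))


def survival_probability(baseline_survival, beta, x):
    """P(lifetime > t | x) = S0(t) ** exp(beta . x) under proportional hazards."""
    return baseline_survival ** risk_score(beta, x)


# Hypothetical coefficients for (early comments, early links).
# Negative values mean that more early activity lowers the hazard of the
# thread ending, so the thread is more likely to outlive the threshold.
beta = [-0.05, -0.10]

# Hypothetical baseline probability that a thread is still active at threshold t.
s0 = 0.40

quiet_thread = [0, 0]   # no early comments, no early links
busy_thread = [10, 2]   # 10 early comments, 2 early links

p_quiet = survival_probability(s0, beta, quiet_thread)
p_busy = survival_probability(s0, beta, busy_thread)

print(f"quiet thread outlives threshold with probability {p_quiet:.3f}")
print(f"busy thread outlives threshold with probability {p_busy:.3f}")
```

A thread whose factors are all zero falls back to the baseline probability, which is what makes the baseline the component that absorbs everything the explanatory factors do not capture.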
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-00527135
Contributor : Salamatian Kavé
Submitted on : Monday, October 18, 2010 - 11:41:46 AM
Last modification on : Monday, June 24, 2019 - 12:32:04 PM
Long-term archiving on : Friday, October 26, 2012 - 11:25:38 AM

File

wi2010_lee.pdf
Files produced by the author(s)

Citation

Jong Gun Lee, Sue Moon, Kavé Salamatian. An Approach to Model and Predict the Popularity of Online Contents with Explanatory Factors. WI-IAT 2010 - IEEE / WIC / ACM International Conferences on Web Intelligence and Intelligent Agent Technology, Aug 2010, Toronto, Canada. pp.623-630, ⟨10.1109/WI-IAT.2010.209⟩. ⟨hal-00527135⟩
