Automating opinion analysis in film reviews : the case of statistic versus linguistic approach

Damien Poirier (1, 2), Cécile Bothorel (3), Emilie Guimier de Neef (1), Marc Boullé (1)
Lab-STICC - Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance
Abstract : Community sites are by nature places dedicated to expressing and publishing opinions. is an example of a participative web site, with tens of millions of enthusiasts sharing their feelings and views on movies, providing positive feedback as well as sharp criticism. For anyone interested in understanding the expectations of net users, such web sites are of major importance because they offer the opportunity to probe huge volumes of user-generated content. But to actually benefit from this large amount of data, one has to be able to extract users' opinions automatically. This is the challenge we tackle in this paper. Our goal is to exploit the various reviews written by a user in order to compute a model which can then be used to predict the user's verdict on a movie. We explore two different methods to extract opinions. The first one relies on a machine learning technique based on a naive Bayes classifier. The second method consists of applying NLP techniques to process opinions and build dictionaries: those dictionaries are then used to determine the polarity of a comment from the words it contains. We applied both approaches to content from : the results we provide make it possible to identify the most appropriate approach for a given data set.
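The two approaches named in the abstract can be contrasted with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the whitespace tokenizer, the add-one (Laplace) smoothing, the label names, and the tiny `POSITIVE` / `NEGATIVE` word lists are all hypothetical choices made here for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization.
    return text.lower().split()


def train_naive_bayes(labelled_reviews):
    """Count words per label from an iterable of (text, label) pairs."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of reviews
    vocab = set()
    for text, label in labelled_reviews:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return word_counts, doc_counts, vocab


def predict_naive_bayes(model, text):
    """Return the label with the highest log-posterior score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                # Assumption: add-one (Laplace) smoothing.
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical opinion dictionaries; a real system would build them with NLP tools.
POSITIVE = {"great", "wonderful", "loved"}
NEGATIVE = {"boring", "awful", "hated"}


def predict_dictionary(text):
    """Polarity from the balance of positive and negative dictionary words."""
    tokens = tokenize(text)
    score = sum(w in POSITIVE for w in tokens) - sum(w in NEGATIVE for w in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

The statistical classifier needs labelled reviews but no hand-built resources, while the dictionary method needs no training data but depends entirely on the coverage of its word lists.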
Document type :
Conference papers
Contributor : Damien Poirier
Submitted on : Tuesday, March 23, 2010 - 4:37:17 PM
Last modification on : Monday, February 25, 2019 - 3:14:12 PM
Document(s) archived on : Friday, June 25, 2010 - 12:04:09 PM


Files produced by the author(s)


  • HAL Id : hal-00466402, version 1


Damien Poirier, Cécile Bothorel, Emilie Guimier de Neef, Marc Boullé. Automating opinion analysis in film reviews : the case of statistic versus linguistic approach. Language Resources and Evaluation Conference 2008, May 2008, Morocco. pp. 94-101. ⟨hal-00466402⟩


