Automating opinion analysis in film reviews : the case of statistic versus linguistic approach

Damien Poirier (1, 2), Cécile Bothorel (3), Emilie Guimier de Neef (1), Marc Boullé (1)
3 Lab-STICC_TB_CID_DECIDE
Lab-STICC - Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance
Abstract: Community sites are by nature places dedicated to expressing and publishing opinions. www.flixster.com is an example of a participative web site, with tens of millions of enthusiasts sharing their feelings and views on movies, providing positive feedback as well as sharp criticism. For anyone interested in understanding net users' expectations, such web sites are of major importance because they offer the opportunity to probe huge volumes of user-generated content. But to actually benefit from these large amounts of data, one has to be able to automatically extract users' opinions. This is the challenge we tackle in this paper. Our goal is to exploit the various reviews written by a user in order to compute a model that can then be used to predict the user's verdict on a movie. We explore two different methods to extract opinions. The first relies on a machine learning technique based on a naive Bayesian classifier. The second consists in applying NLP techniques to process opinions and build dictionaries; those dictionaries are then used to determine the polarity of a comment given the words it contains. We applied both approaches to content from flixster.com: the results we provide enable us to discern the most appropriate approach for a given set of data.
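The two families of methods named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy reviews, the labels, the `POLARITY` lexicon and all function names are hypothetical, the classifier is a plain multinomial naive Bayes with Laplace smoothing, and the NLP step that builds the dictionaries is replaced by a hand-written word list.

```python
import math
from collections import Counter

# Hypothetical toy data, invented for illustration (not from flixster.com).
TRAIN = [
    ("a wonderful and moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("boring plot and awful dialogue", "neg"),
    ("a dull and awful movie", "neg"),
]


def train_naive_bayes(samples):
    """Collect per-class word counts for a multinomial naive Bayes model."""
    word_counts = {}          # label -> Counter of word frequencies
    class_counts = Counter()  # label -> number of training reviews
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():
            counts[word] += 1
            vocab.add(word)
    return word_counts, class_counts, vocab


def predict_naive_bayes(model, text):
    """Return the label with the highest log posterior (Laplace smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training are ignored
            smoothed = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical hand-written polarity dictionary; the paper builds its
# dictionaries with NLP techniques, which this sketch does not reproduce.
POLARITY = {"wonderful": 1, "great": 1, "moving": 1,
            "boring": -1, "awful": -1, "dull": -1}


def predict_with_dictionary(text, lexicon=POLARITY):
    """Sum the polarity of known words and map the total to a label."""
    score = sum(lexicon.get(word, 0) for word in text.lower().split())
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neutral"


model = train_naive_bayes(TRAIN)
```

Calling `predict_naive_bayes(model, "a great and moving story")` and `predict_with_dictionary("a great and moving story")` both classify the review as positive on this toy data. The statistical model learns its word weights from labelled reviews, whereas the dictionary approach depends entirely on the coverage of the lexicon.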
Document type:
Conference paper
Language Resources and Evaluation Conference 2008, May 2008, Morocco. pp. 94-101, 2008

https://hal.archives-ouvertes.fr/hal-00466402
Contributor: Damien Poirier
Submitted on: Tuesday, March 23, 2010 - 16:37:17
Last modified on: Thursday, February 7, 2019 - 16:20:45
Document(s) archived on: Friday, June 25, 2010 - 12:04:09

File

PoirierEtAlLREC08.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00466402, version 1

Citation

Damien Poirier, Cécile Bothorel, Emilie Guimier de Neef, Marc Boullé. Automating opinion analysis in film reviews : the case of statistic versus linguistic approach. Language Resources and Evaluation Conference 2008, May 2008, Morocco. pp. 94-101, 2008. 〈hal-00466402〉
