Normalization of Term Weighting Scheme for Sentiment Analysis

Alexander Pak; Patrick Paroubek; Amel Fraisse; Gil Francopoulo

doi:10.1007/978-3-319-08958-4_10

Book chapter, Year: 2014


Alexander Pak

Role: Author

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Patrick Paroubek

Role: Author
PersonId : 20704
IdHAL : patrick-paroubek
ORCID : 0000-0002-4302-1894
IdRef : 057218730

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Amel Fraisse

Role: Author
PersonId : 15486
IdHAL : amel-fraisse
ORCID : 0000-0002-8693-8862
IdRef : 146155580

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Gil Francopoulo

Role: Author
PersonId : 837227

Tagmatica

Abstract

The n-gram model with a binary (or tf-idf) weighting scheme and an SVM classifier is a common approach that serves as a baseline in much of the research on sentiment analysis and opinion mining. More advanced methods build on this model to improve classification accuracy, for example by generating additional features or by using supplementary linguistic resources. In this paper, we show how a simple technique, normalizing the term weights in the basic bag-of-words method, can improve both the overall classification accuracy and the classification of minor reviews. Other systems based on the n-gram model may also benefit from this method. We tested our approach on movie review and product review datasets and show that our normalization technique improves the classification accuracy of the traditional weighting schemes. We work on English in this paper; however, the technique should be considered language independent, since it uses no language-specific resource other than a training corpus. The question remains, though, whether similar performance gains would be observed for other language families.
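To make the baseline named in the abstract concrete, the following minimal sketch computes binary and tf-idf weights over word n-grams. It illustrates only the traditional weighting schemes, not the normalization technique, which the abstract does not specify. The function names, the toy corpus, and the particular tf-idf variant (raw count times log(N/df)) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all n-grams of length 1..n_max as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def binary_weights(doc_terms):
    """Binary scheme: every term present in the document gets weight 1."""
    return {term: 1.0 for term in set(doc_terms)}


def tfidf_weights(doc_terms, doc_freq, n_docs):
    """tf-idf scheme (assumed variant): raw count times log(N / df)."""
    counts = Counter(doc_terms)
    return {t: c * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}


# Toy corpus, invented for illustration only.
corpus = [
    "a great great movie",
    "a dull movie",
    "great acting",
]
docs = [ngrams(text.split()) for text in corpus]

# Document frequency: number of documents that contain each term.
doc_freq = Counter(term for doc in docs for term in set(doc))

binary = [binary_weights(doc) for doc in docs]
tfidf = [tfidf_weights(doc, doc_freq, len(docs)) for doc in docs]
```

In the baseline setup, sparse vectors of this kind would be passed to an SVM classifier; any normalization of the weights would be applied before that step.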

Domains

Computer Science [cs]; Computation and Language [cs.CL]; Artificial Intelligence [cs.AI]

Contributor: Amel Fraisse

https://hal.science/hal-01618421

Submitted on: Tuesday, October 17, 2017, 20:57:40

Last modified on: Saturday, October 7, 2023, 21:36:20

Dates and versions

hal-01618421 , version 1 (17-10-2017)

Identifiers

HAL Id : hal-01618421 , version 1
DOI : 10.1007/978-3-319-08958-4_10

Cite

Alexander Pak, Patrick Paroubek, Amel Fraisse, Gil Francopoulo. Normalization of Term Weighting Scheme for Sentiment Analysis. Human Language Technology Challenges for Computer Science and Linguistics, Vol. 8387, Springer, Cham, 2014, Lecture Notes in Artificial Intelligence, 978-3-319-08957-7. ⟨10.1007/978-3-319-08958-4_10⟩. ⟨hal-01618421⟩


Collections

UNIV-LILLE3 CNRS LIMSI GERIICO SORBONNE-UNIVERSITE LISN
