Book chapter, Year: 2014

Normalization of Term Weighting Scheme for Sentiment Analysis

Abstract

The n-gram model with a binary (or tf-idf) weighting scheme and an SVM classifier is a common approach that serves as a baseline in much of the research on sentiment analysis and opinion mining. More advanced methods are built on top of this model to improve classification accuracy, for example by generating additional features or by using supplementary linguistic resources. In this paper, we show how a simple technique, normalizing the term weights in the basic bag-of-words method, can improve both the overall classification accuracy and the classification of minor reviews. Other systems based on the n-gram model may benefit from this method as well. We have tested our approach on the movie review and product review datasets and show that our normalization technique improves the classification accuracy of the traditional weighting schemes. In this paper we work on English; however, the technique should be considered language independent, since it uses no language-specific resource other than a training corpus. The question remains, though, whether we would observe similar performance gains for other language families.
File not deposited
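For readers unfamiliar with the baseline named in the abstract, the two traditional weighting schemes can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the helper names (`ngrams`, `binary_weights`, `tfidf_weights`), the toy corpus, and the textbook tf-idf variant (raw term frequency times `log(N / df)`) are all assumptions. The paper's normalization step is not specified in the abstract and is deliberately not reproduced here.

```python
import math
from collections import Counter

def ngrams(tokens, n_max=2):
    """Return all n-grams of length 1..n_max as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def binary_weights(doc_tokens, n_max=2):
    """Binary scheme: weight 1.0 for every n-gram present in the document."""
    return {g: 1.0 for g in ngrams(doc_tokens, n_max)}

def tfidf_weights(corpus_tokens, n_max=2):
    """Textbook tf-idf (an assumed variant): tf * log(N / document frequency)."""
    docs = [Counter(ngrams(toks, n_max)) for toks in corpus_tokens]
    n_docs = len(docs)
    # Iterating a Counter yields its distinct keys, so each document
    # contributes at most 1 to the document frequency of a term.
    df = Counter(g for d in docs for g in d)
    return [{g: tf * math.log(n_docs / df[g]) for g, tf in d.items()}
            for d in docs]

# Hypothetical two-review corpus, for illustration only.
corpus = [
    "a great great movie".split(),
    "a boring movie".split(),
]
binary = binary_weights(corpus[0])
tfidf = tfidf_weights(corpus)
```

In a full baseline, weight dictionaries like these would be turned into sparse feature vectors and passed to an SVM classifier; that step is omitted to keep the sketch dependency-free.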

Dates and versions

hal-01618421, version 1 (17-10-2017)

Cite

Alexander Pak, Patrick Paroubek, Amel Fraisse, Gil Francopoulo. Normalization of Term Weighting Scheme for Sentiment Analysis. Human Language Technology Challenges for Computer Science and Linguistics, Vol. 8387, Springer, Cham, 2014, Lecture Notes in Artificial Intelligence, 978-3-319-08957-7. ⟨10.1007/978-3-319-08958-4_10⟩. ⟨hal-01618421⟩