Conference paper, Year: 2009

Automatic creation of a reference corpus for political opinion mining in user-generated content

Abstract

We propose and evaluate a method for automatically creating a reference corpus for training text classification procedures for mining political opinions in user-generated content. The process starts by compiling a collection of highly opinionated comments posted by users on an online newspaper. Then, we define and use a set of manually crafted high-precision rules, supported by a large sentiment lexicon, to identify sentences in each comment that express opinions about political entities. Finally, the opinions found are propagated to the remaining sentences of the comment that mention the same entities, thus increasing the number and variety of opinion-bearing sentences. Results show that most of the rules can identify negative opinions with very high precision, and that these can be safely propagated to the remaining sentences in the comment in almost 100% of the cases. Due to problems arising from irony, the precision of identification drops for positive opinions, but several rules still reach high precision. Propagation of positive opinions is correct in about 77% of the cases, and most errors at this stage result from irony and polarity inversion throughout the comment.
File not deposited

Dates and versions

hal-01109751 , version 1 (28-01-2015)

Cite

Luís Sarmento, Paula Carvalho, Mário J. Silva, Eugénio de Oliveira. Automatic creation of a reference corpus for political opinion mining in user-generated content. Text Sentiment Analysis (TSA'09), 2009, Hong Kong, China. ⟨10.1145/1651461.1651468⟩. ⟨hal-01109751⟩

Collections

LIGM_LINGU_INVITE