Conference paper, Year: 2010

Finding topic-specific strings in text categorization and opinion mining contexts

Rémi Lavalley
Chloé Clavel
EDF
Marc El Bèze
  • Role: Author
  • PersonId : 949557
Patrice Bellot

Abstract

In this paper, we present a new probabilistic method for automatically extracting topic-specific strings in a text categorization context. The advantage of this method is twofold. First, it automatically identifies the expressions that characterize a specific topic category, which can support potential knowledge modelling. Second, it helps improve categorization results by providing the classifier with text spans that are more relevant than isolated words. The novelty of our approach thus lies not only in the method used to extract topic-specific strings but also in the adaptation of the traditional cosine similarity measure for text categorization. For the evaluation, we chose two different and challenging corpora: movie reviews written by Internet users and manual transcriptions of call center conversations. On both tasks, we observed a gain in categorization results (between 1 and 8%).
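
The abstract does not detail the extraction method or the adapted similarity measure, so neither is reproduced here. As a rough illustration of the general idea only, the sketch below uses the standard (unadapted) cosine similarity and treats every word bigram as a candidate "string" alongside isolated words. The function names (`features`, `cosine`, `categorize`), the toy training texts, and the choice of plain n-gram counts are all assumptions made for this example, not the authors' method.

```python
import math
from collections import Counter


def features(tokens, max_n=2):
    """Count all word n-grams of length 1..max_n (illustrative feature set)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def cosine(a, b):
    """Standard cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def categorize(tokens, profiles, max_n=2):
    """Return the category whose profile is most similar to the document."""
    doc = features(tokens, max_n)
    return max(profiles, key=lambda label: cosine(doc, profiles[label]))


# Hypothetical toy training data: one short text per category.
train = {
    "positive": "a great movie well worth seeing".split(),
    "negative": "a waste of time not worth seeing".split(),
}
profiles = {label: features(tokens) for label, tokens in train.items()}

label = categorize("not worth seeing at all".split(), profiles)
print(label)
```

In this toy case the isolated words "worth" and "seeing" occur in both categories, while the bigram "not worth" occurs only in the negative profile, which is the kind of signal that multi-word spans can add over single words.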
File not deposited

Dates and versions

hal-01318044, version 1 (19-05-2016)

Identifiers

  • HAL Id: hal-01318044, version 1

Cite

Rémi Lavalley, Chloé Clavel, Marc El Bèze, Patrice Bellot. Finding topic-specific strings in text categorization and opinion mining contexts. DMIN'10, Jul 2010, Las Vegas, United States. ⟨hal-01318044⟩