Learning Word Importance with the Neural Bag-of-Words Model

Imran Sheikh¹, Irina Illina¹, Dominique Fohr¹, Georges Linares²
¹ MULTISPEECH (Speech Modeling for Facilitating Oral-Based Communication), Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: The Neural Bag-of-Words (NBOW) model performs classification with an average of the input word vectors and achieves impressive performance. While the NBOW model learns word vectors targeted to the classification task, it does not explicitly model which words are important for a given task. In this paper we propose an improved NBOW model with the ability to learn task-specific word importance weights. The word importance weights are learned by introducing a new weighted-sum composition of the word vectors. With experiments on standard topic and sentiment classification tasks, we show that (a) our proposed model learns meaningful word importance for a given task, and (b) our model gives the best accuracies among the bag-of-words approaches. We also show that the learned word importance weights are comparable to tf-idf based word weights when used as features in a BOW-SVM classifier.
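As a rough illustration of the idea described in the abstract (this is a minimal sketch, not the authors' implementation), the model below gives each vocabulary word a scalar importance weight in addition to its embedding, and composes a document vector as a normalized weighted sum of its word vectors instead of the plain NBOW average. The PyTorch framing, sigmoid squashing of the weights, and all dimensions are illustrative assumptions.

# Minimal sketch of an NBOW classifier with learned per-word importance
# weights (assumed PyTorch formulation; hyperparameters are arbitrary).
import torch
import torch.nn as nn

class WeightedNBOW(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_classes):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One scalar importance weight per vocabulary word.
        self.word_weight = nn.Embedding(vocab_size, 1)
        self.out = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer word indices
        vectors = self.embed(token_ids)                        # (batch, seq_len, embed_dim)
        # Squash raw weights to (0, 1); a sigmoid is one plausible choice.
        weights = torch.sigmoid(self.word_weight(token_ids))   # (batch, seq_len, 1)
        # Weighted-sum composition instead of the plain NBOW average.
        doc_vector = (weights * vectors).sum(dim=1) / weights.sum(dim=1)
        return self.out(doc_vector)                            # class logits

model = WeightedNBOW(vocab_size=20000, embed_dim=300, num_classes=2)
logits = model(torch.randint(0, 20000, (4, 50)))               # toy batch of 4 documents

Training such a model end to end with a standard cross-entropy loss learns both the word vectors and the per-word importance weights for the target task.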
Document type:
Conference paper
ACL, Representation Learning for NLP (Repl4NLP) workshop, Aug 2016, Berlin, Germany. Proceedings of ACL 2016

Cited literature: 40 references

https://hal.archives-ouvertes.fr/hal-01331720
Contributor: Dominique Fohr
Submitted on: Thursday, 20 October 2016 - 09:52:52
Last modified on: Tuesday, 18 December 2018 - 16:38:02

File

repl4nlp_draft21Jun16.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01331720, version 1


Citation

Imran Sheikh, Irina Illina, Dominique Fohr, Georges Linares. Learning Word Importance with the Neural Bag-of-Words Model. ACL, Representation Learning for NLP (Repl4NLP) workshop, Aug 2016, Berlin, Germany. Proceedings of ACL 2016. 〈hal-01331720〉


Metrics

Record views: 718
File downloads: 428