Conference paper, Year: 2021

Multiword Expression Features for Automatic Hate Speech Detection

Abstract

The task of automatically detecting hate speech in social media is attracting growing attention. Given the enormous volume of content posted daily, human monitoring of hate speech is infeasible. In this work, we propose new word-level features for automatic hate speech detection (HSD): multiword expressions (MWEs). MWEs are lexical units larger than a word that have idiomatic and compositional meanings. We propose integrating MWE features into a deep neural network-based HSD framework. Our baseline HSD system relies on the Universal Sentence Encoder (USE). To incorporate MWE features, we create a three-branch deep neural network: one branch for USE, one for MWE categories, and one for MWE embeddings. We conduct experiments on two hate speech tweet corpora with different MWE categories and with two types of MWE embeddings, word2vec and BERT. Our experiments demonstrate that the proposed HSD system with MWE features significantly outperforms the baseline system in terms of macro-F1.
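The three-branch design described in the abstract can be illustrated with a minimal forward-pass sketch in plain Python. It only shows how three separately encoded inputs might be concatenated before a shared classifier. The abstract does not specify layer types, sizes, or training details, so everything here is an assumption made for illustration: the class name `ThreeBranchSketch`, the use of one dense ReLU layer per branch, the hidden size, the number of MWE categories, and the 300-dimensional MWE embedding (a common word2vec size) are all hypothetical. The 512-dimensional USE input matches the publicly released Universal Sentence Encoder.

```python
import math
import random


def init_layer(rng, n_in, n_out):
    """Return small random weights and zero biases for an n_in -> n_out layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def affine(x, weights, bias):
    """Compute W x + b for one input vector."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def dense_relu(x, weights, bias):
    """Fully connected layer followed by ReLU."""
    return [max(0.0, z) for z in affine(x, weights, bias)]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ThreeBranchSketch:
    """Hypothetical three-branch classifier: USE, MWE categories, MWE embeddings."""

    def __init__(self, use_dim=512, n_mwe_categories=10, mwe_emb_dim=300,
                 hidden=16, n_classes=2, seed=0):
        rng = random.Random(seed)
        # One branch per input type (assumed dense layers, not the authors' layers).
        self.use_branch = init_layer(rng, use_dim, hidden)
        self.cat_branch = init_layer(rng, n_mwe_categories, hidden)
        self.emb_branch = init_layer(rng, mwe_emb_dim, hidden)
        # Shared output layer over the concatenated branch outputs.
        self.out_layer = init_layer(rng, 3 * hidden, n_classes)

    def forward(self, use_vec, mwe_cat_vec, mwe_emb_vec):
        """Return class probabilities for one tweet."""
        merged = (dense_relu(use_vec, *self.use_branch)
                  + dense_relu(mwe_cat_vec, *self.cat_branch)
                  + dense_relu(mwe_emb_vec, *self.emb_branch))
        return softmax(affine(merged, *self.out_layer))
```

With untrained random weights the output is meaningless as a prediction. The sketch is meant only to show the data flow: each feature type is encoded in its own branch, and the classifier sees the concatenation of the three branch outputs.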
Main file
NLDB_2021___Multiword_Expression_Features_for_Automatic_Hate_Speech_Detection__short_paper___.pdf (135.1 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03231047, version 1 (21-05-2021)

Identifiers

  • HAL Id: hal-03231047, version 1

Cite

Nicolas Zampieri, Irina Illina, Dominique Fohr. Multiword Expression Features for Automatic Hate Speech Detection. NLDB 2021 - 26th International Conference on Natural Language & Information Systems, Jun 2021, Saarbrücken/Virtual, Germany. ⟨hal-03231047⟩
83 views
238 downloads
