Cross-Platform Evaluation for Italian Hate Speech Detection

Abstract : Despite the number of approaches recently proposed in NLP for detecting abusive language on social networks, developing hate speech detection systems that are robust across different platforms remains an unsolved problem. In this paper we perform a comparative evaluation on datasets for hate speech detection in Italian, extracted from four different social media platforms: Facebook, Twitter, Instagram and WhatsApp. We show that combining such platform-dependent datasets, so as to take advantage of training data developed for other platforms, is beneficial, although the impact varies depending on the social network under consideration.
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-02381152
Contributor : Serena Villata
Submitted on : Wednesday, November 27, 2019 - 10:23:12 AM
Last modification on : Monday, December 2, 2019 - 9:45:29 AM

File

paper22.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02381152, version 1

Citation

Michele Corazza, Stefano Menini, Elena Cabrio, Sara Tonelli, Serena Villata. Cross-Platform Evaluation for Italian Hate Speech Detection. Proceedings of the Sixth Italian Conference on Computational Linguistics, Nov 2019, Bari, Italy. ⟨hal-02381152⟩
