Conference paper, Year: 2020

Tuning Graph2vec with Node Labels for Abuse Detection in Online Conversations

Noé Cecillon
Richard Dufour
Vincent Labatut
Georges Linares

Abstract

In recent years, online social media have allowed people to meet and hold discussions worldwide. These popular platforms face a growing volume of abusive content. To automate the detection of such content, researchers have proposed various methods based on Natural Language Processing (NLP), and have also leveraged behavioral information about users and the structure of conversations. In our previous work, we proposed combining NLP and conversational graph-based features to detect abusive messages in chat logs extracted from an online game. These conversational graphs model interactions between users (i.e., who is arguing with whom?) while completely ignoring the language content of the messages. We characterized the structure of these graphs by computing a large set of manually selected topological measures, and used them as features to train a classifier to detect abusive messages. Graph embedding methods represent graphs as low-dimensional vectors while preserving at least part of their topological properties. In addition to the plain structure, certain methods can capture further information such as node labels or the weight and direction of edges. Because these representations are learned automatically, they require no feature selection or feature engineering. Four main categories of graph embedding methods can be distinguished, depending on the nature of the objects considered: node, edge, subgraph, and whole-graph embeddings. Each category suits different applications and problems. In this paper, we focus on the information that some embedding approaches use in addition to the plain structure. In particular, we study the impact of the node labels used by Graph2vec, a whole-graph embedding method, and assess the effectiveness of this additional information in the context of online abuse detection.
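To make the role of node labels concrete, here is a minimal sketch of Weisfeiler-Lehman relabeling, the procedure Graph2vec was originally described as using to turn each graph into a bag of rooted-subgraph "words" before learning an embedding. It is an illustration only, not the authors' pipeline: the toy conversation, the function name `wl_subgraph_words`, and the three labeling schemes (uniform, degree, and a "target"/"other" role) are all assumptions chosen for the example, and the embedding step itself is omitted.

```python
import hashlib
from collections import Counter


def wl_subgraph_words(adjacency, labels, iterations=2):
    """Return the multiset of Weisfeiler-Lehman labels for one graph.

    adjacency: dict mapping each node to a list of its neighbours.
    labels: dict mapping each node to its initial label.
    """
    current = {node: str(labels[node]) for node in adjacency}
    words = Counter(current.values())  # depth-0 "words": the raw labels
    for _ in range(iterations):
        updated = {}
        for node, neighbours in adjacency.items():
            # Combine the node's label with its neighbours' sorted labels,
            # so the result does not depend on neighbour ordering.
            signature = current[node] + "|" + ",".join(
                sorted(current[n] for n in neighbours)
            )
            digest = hashlib.sha256(signature.encode("utf-8")).hexdigest()
            updated[node] = digest[:8]  # short hash as the compressed label
        current = updated
        words.update(current.values())
    return words


# Hypothetical toy conversation: an edge means two users interacted.
conversation = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

# Assumed labeling schemes, for illustration only.
uniform = {node: "x" for node in conversation}  # structure only
degree = {node: str(len(nbrs)) for node, nbrs in conversation.items()}
role = {node: "target" if node == "c" else "other" for node in conversation}

for name, labelling in [("uniform", uniform), ("degree", degree), ("role", role)]:
    words = wl_subgraph_words(conversation, labelling)
    print(name, len(words), "distinct words")
```

The same graph yields a different vocabulary under each labeling, which is why the choice of node labels can change the whole-graph representation that is eventually learned.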
Main file
manuscript.pdf (151.3 KB)
présentation.pdf (81.88 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02993571, version 1 (06-11-2020)

License

Attribution - NonCommercial - ShareAlike

Identifiers

  • HAL Id: hal-02993571, version 1

Cite

Noé Cecillon, Richard Dufour, Vincent Labatut, Georges Linares. Tuning Graph2vec with Node Labels for Abuse Detection in Online Conversations. 11ème Conférence Modèles & Analyse de Réseaux : approches mathématiques et informatiques (MARAMI), Oct 2020, Montpellier, France. ⟨hal-02993571⟩

Collections

UNIV-AVIGNON LIA
245 views
109 downloads