Tuning Graph2vec with Node Labels for Abuse Detection in Online Conversations

Noé Cecillon; Richard Dufour; Vincent Labatut; Georges Linares

Conference paper, Year: 2020

Noé Cecillon

Role: Author
PersonId: 182978
IdHAL: noe-cecillon
ORCID: 0000-0002-9889-0931

Laboratoire Informatique d'Avignon

Richard Dufour

Role: Author
PersonId: 178348
IdHAL: richard-dufour
ORCID: 0000-0003-1203-9108

Laboratoire Informatique d'Avignon

Vincent Labatut

Role: Author
PersonId: 482
IdHAL: vlabatut
ORCID: 0000-0002-2619-2835
IdRef: 076951375

Laboratoire Informatique d'Avignon

Georges Linares

Role: Author
PersonId: 4977
IdHAL: georges-linares
IdRef: 079368794

Laboratoire Informatique d'Avignon

Abstract

In recent years, online social media have allowed people around the world to meet and hold discussions. These popular platforms face a growing amount of abusive content. To automate the detection of such content, researchers have proposed various methods based on Natural Language Processing (NLP), and have also leveraged behavioral information about users and the structure of conversations. In our previous work, we proposed combining NLP and conversational graph-based features to detect abusive messages in chat logs extracted from an online game. These conversational graphs model interactions between users (i.e., who is arguing with whom?) while completely ignoring the language content of the messages. We characterized the structure of these graphs by computing a large set of manually selected topological measures, and used them as features to train a classifier to detect abusive messages. Graph embedding methods represent graphs as low-dimensional vectors while preserving at least part of their topological properties. In addition to the plain structure, certain methods can capture additional information such as node labels or the weight and direction of edges. Because these representations are learned automatically, they have the advantage of requiring no feature selection or feature engineering. Four main categories of graph embedding methods can be distinguished, depending on the nature of the embedded objects: node, edge, subgraph, and whole-graph embeddings. Each category suits different applications and problems. In this paper, we focus on the information that some embedding approaches use in addition to the plain structure. In particular, we study the impact of the node labels used by Graph2vec, a whole-graph embedding method, and assess the effectiveness of this additional information in the context of online abuse detection.
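To illustrate where node labels enter a Graph2vec-style pipeline, here is a minimal sketch of Weisfeiler-Lehman relabeling, the procedure Graph2vec relies on to turn a graph into a bag of rooted-subgraph "words" before learning the embedding. The toy conversation graph, the `role_labels` values, and the function name `wl_subtree_words` are assumptions made for illustration; they are not taken from the paper. Real implementations usually compress each signature with a hash, whereas this sketch keeps the raw strings for readability.

```python
def wl_subtree_words(adjacency, labels, iterations=2):
    """Return the list of Weisfeiler-Lehman labels ("words") of one graph.

    adjacency: dict mapping each node to a list of its neighbours.
    labels:    dict mapping each node to its initial label.
    """
    current = {node: str(labels[node]) for node in adjacency}
    words = list(current.values())
    for _ in range(iterations):
        updated = {}
        for node, neighbours in adjacency.items():
            # A node's new label combines its own label with the sorted
            # multiset of its neighbours' labels.
            neighbourhood = sorted(current[n] for n in neighbours)
            updated[node] = current[node] + "(" + ",".join(neighbourhood) + ")"
        current = updated
        words.extend(current.values())
    return words


# Hypothetical conversation graph: user "a" interacts with "b" and "c".
adjacency = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}

# Structure-only labels: each node is labelled with its degree.
degree_labels = {node: len(nbrs) for node, nbrs in adjacency.items()}

# Hypothetical semantic labels attached to the nodes.
role_labels = {"a": "author", "b": "other", "c": "target"}

words_degree = wl_subtree_words(adjacency, degree_labels)
words_role = wl_subtree_words(adjacency, role_labels)
```

With degree labels, nodes "b" and "c" are indistinguishable and produce identical words at every iteration, while the role labels separate them. The vocabulary that the embedding model sees therefore depends directly on the choice of node labels.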

Keywords

Graph embeddings; Conversational networks; Abuse detection

Domains

Computer Science [cs]; Social and Information Networks [cs.SI]

Main file

manuscript.pdf (151.3 KB)

présentation.pdf (81.88 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-02993571

Submitted on: Friday, November 6, 2020, 19:44:06

Last modified on: Friday, November 12, 2021, 11:18:03

Dates and versions

hal-02993571, version 1 (06-11-2020)

License

Attribution - NonCommercial - ShareAlike

Identifiers

HAL Id: hal-02993571, version 1

Cite

Noé Cecillon, Richard Dufour, Vincent Labatut, Georges Linares. Tuning Graph2vec with Node Labels for Abuse Detection in Online Conversations. 11ème Conférence Modèles & Analyse de Réseaux : approches mathématiques et informatiques (MARAMI), Oct 2020, Montpellier, France. ⟨hal-02993571⟩

Collections

UNIV-AVIGNON LIA
