Conference papers

Tuning Graph2vec with Node Labels for Abuse Detection in Online Conversations

Abstract : In recent years, online social media have allowed people to meet and discuss worldwide. These popular platforms are confronted with an increasing amount of abusive content. To automate the detection of such content, researchers have proposed various methods based on Natural Language Processing (NLP), and have leveraged behavioral information about users and the structure of conversations. In our previous work, we proposed to combine NLP and conversational graph-based features to detect abusive messages in chat logs extracted from an online game. These conversational graphs model interactions between users (i.e. who is arguing with whom?), while completely ignoring the language content of the messages. We characterized the structure of these graphs by computing a large set of manually selected topological measures, and used them as features to train a classifier to detect abusive messages. Graph embedding methods represent graphs as low-dimensional vectors while preserving at least part of their topological properties. In addition to the plain structure, certain methods can capture additional information such as node labels or the weight and direction of edges. These representations are learned automatically, so they have the advantage of requiring no feature selection or feature engineering. One can distinguish four main categories of graph embedding methods, depending on the nature of the objects considered: node, edge, subgraph and whole-graph embeddings. Each category fits the needs of different applications and problems. In this paper, we focus on the information that some embedding approaches use in addition to the plain structure. In particular, we study the impact of the node labels used by Graph2vec, a whole-graph embedding method, and assess the effectiveness of such additional information in the context of online abuse detection.
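To illustrate why node labels can matter for a whole-graph embedding, the sketch below runs a Weisfeiler-Lehman style relabeling, the kind of procedure Graph2vec is generally described as using to build its vocabulary of rooted subgraphs. It is only a sketch under stated assumptions: the function `wl_features`, the toy three-user graph and the "author"/"other" role labels are hypothetical and are not taken from the paper, and the embedding training step itself is omitted.

```python
from collections import Counter


def wl_features(adjacency, labels, iterations=2):
    """Count Weisfeiler-Lehman subtree signatures for one graph.

    adjacency: dict mapping each node to a list of its neighbors.
    labels: dict mapping each node to its initial label.
    """
    current = {node: str(labels[node]) for node in adjacency}
    features = Counter(current.values())
    for _ in range(iterations):
        updated = {}
        for node, neighbors in adjacency.items():
            # New signature = own signature + sorted neighbor signatures.
            neighborhood = sorted(current[n] for n in neighbors)
            updated[node] = f"{current[node]}({','.join(neighborhood)})"
        current = updated
        features.update(current.values())
    return features


# Hypothetical toy conversation: three users who all interact.
adjacency = {"u1": ["u2", "u3"], "u2": ["u1", "u3"], "u3": ["u1", "u2"]}

# Labeling A (assumption): identical labels, so only structure is seen.
uniform = {node: "user" for node in adjacency}
# Labeling B (assumption): a role label singles out one user.
roles = {"u1": "author", "u2": "other", "u3": "other"}

feats_uniform = wl_features(adjacency, uniform)
feats_roles = wl_features(adjacency, roles)
print(len(feats_uniform), len(feats_roles))
```

With identical labels, all three nodes of the triangle receive the same signature at every iteration, whereas the role labels yield distinct signatures for the labeled user and for the others. The same structure therefore produces a richer set of features, which is the kind of effect the choice of node labels can have on the resulting graph representation.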

https://hal.archives-ouvertes.fr/hal-02993571
Contributor : Noé Cécillon
Submitted on : Friday, November 6, 2020 - 7:44:06 PM
Last modification on : Friday, November 12, 2021 - 11:18:03 AM

Licence


Distributed under a Creative Commons Attribution - NonCommercial - ShareAlike 4.0 International License

Identifiers

  • HAL Id : hal-02993571, version 1

Citation

Noé Cecillon, Richard Dufour, Vincent Labatut, Georges Linares. Tuning Graph2vec with Node Labels for Abuse Detection in Online Conversations. 11ème Conférence Modèles & Analyse de Réseaux : approches mathématiques et informatiques (MARAMI), Oct 2020, Montpellier, France. ⟨hal-02993571⟩
