An Annotated Corpus for Sexism Detection in French Tweets

Social media networks have become a space where users freely express their opinions and sentiments, which can lead to the wide spread of hateful or abusive messages that have to be moderated. This paper presents the first French corpus annotated for sexism detection, composed of about 12,000 tweets. In a context where the moderation of offensive content on social media is now regulated by European laws, we think it is important not only to detect sexist content automatically but also to identify whether a message with sexist content is really sexist (i.e., addressed to a woman or describing a woman or women in general) or is a story of sexism experienced by a woman. This distinction is the novelty of our annotation scheme. We also report preliminary, encouraging results for sexism detection obtained with a deep learning approach.

Keywords

sexism detection, social media, corpus, speech acts

Domains

Artificial Intelligence [cs.AI]; Technology for Human Learning; Document and Text Processing

Main file

Chiril-et-alLREC12-2020.pdf (285.28 KB)

Origin: Publisher files allowed on an open archive
License: CC BY NC - Attribution - NonCommercial

Contributor: Véronique MORICEAU

https://hal.science/hal-02889035

Submitted on: Tuesday, December 13, 2022, 14:45:20

Last modified on: Thursday, March 28, 2024, 14:00:44

Long-term archiving on: Tuesday, March 14, 2023, 19:19:24

Dates and versions

hal-02889035, version 1 (13-12-2022)

License

Attribution

Identifiers

HAL Id: hal-02889035, version 1

Cite

Patricia Chiril, Véronique Moriceau, Farah Benamara, Alda Mari, Gloria Origgi, et al. An Annotated Corpus for Sexism Detection in French Tweets. 12th Conference on Language Resources and Evaluation (LREC 2020), May 2020, online, France. pp. 1-7. ⟨hal-02889035⟩


Collections

ENS-PARIS UNIV-TLSE2 TICE CNRS CDF UNIV-MONTP3 EHESS DEC SMS LERASS UT1-CAPITOLE PSL IRIT IRIT-MELODI JEAN-NICOD ANR IRIT-IA IRIT-UT3 TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP LABEX-SMS
