An Annotated Corpus for Sexism Detection in French Tweets - HAL open archive
Conference paper, Year: 2020

An Annotated Corpus for Sexism Detection in French Tweets

Abstract

Social media networks have become a space where users are free to share their opinions and sentiments, which may lead to a wide spread of hateful or abusive messages that have to be moderated. This paper presents the first French corpus annotated for sexism detection, composed of about 12,000 tweets. In a context where offensive content mediation on social media is now regulated by European laws, we think it is important not only to detect sexist content automatically but also to identify whether a message with sexist content is really sexist (i.e. addressed to a woman or describing a woman or women in general) or is a story of sexism experienced by a woman. This distinction is the novelty of our annotation scheme. We also present some preliminary results for sexism detection obtained with a deep learning approach. Our experiments show encouraging results.
Main file
Chiril-et-alLREC12-2020.pdf (285.28 KB)
Origin: Publisher files allowed on an open archive
License: CC BY NC - Attribution - NonCommercial

Dates and versions

hal-02889035, version 1 (13-12-2022)

License

Attribution

Identifiers

  • HAL Id: hal-02889035, version 1

Cite

Patricia Chiril, Véronique Moriceau, Farah Benamara, Alda Mari, Gloria Origgi, et al. An Annotated Corpus for Sexism Detection in French Tweets. 12th Conference on Language Resources and Evaluation (LREC 2020), May 2020, online, France. pp.1-7. ⟨hal-02889035⟩
125 Views
76 Downloads