Stacked Gender Prediction from Tweet Texts and Images Notebook for PAN at CLEF 2018

Giovanni Ciccone; Arthur Sultan; Léa Laporte; Elod Egyed-Zsigmond; Alaa Alhamzeh; Michael Granitzer

Conference paper. Year: 2018


Giovanni Ciccone

Role: Author
PersonId: 1042643

Arthur Sultan

Role: Author

Léa Laporte

Role: Author
PersonId: 3200
IdHAL: lea-laporte
ORCID: 0000-0001-5227-2735
IdRef: 180044990

Distribution, Recherche d'Information et Mobilité

Elod Egyed-Zsigmond

Role: Author
PersonId: 4181
IdHAL: elod-egyed-zsigmond
ORCID: 0000-0002-1218-8026
IdRef: 083789138

Distribution, Recherche d'Information et Mobilité

Alaa Alhamzeh

Role: Author
PersonId: 1235127
IdHAL: alaa-alhamzeh

Distribution, Recherche d'Information et Mobilité

Laboratoire d'InfoRmatique en Image et Systèmes d'information

Michael Granitzer

Role: Author

Know-Center Graz

Abstract

This paper describes our participation in the PAN 2018 Author Profiling shared task. Given texts and images from Twitter authors, the goal is to predict each author's gender. We considered all the languages (Arabic, English and Spanish) and all the prediction types (from texts only, from images only, and combined). The final submitted system is a stacked classifier composed of two main parts. The first part, based on previous PAN Author Profiling editions, handles gender prediction from texts. It consists of a pipeline of preprocessing, word n-grams of length 1 to 2, TF-IDF with sublinear weighting, linear support vector classification and probability calibration. The second part consists of several layers of classifiers for gender estimation from images: four base classifiers (object detection, face recognition, colour histograms, local binary patterns) in the first layer, a meta-classifier in the second layer and an aggregation classifier in the third layer. Finally, the two gender predictions, from texts and from images, feed into a final-layer classifier that provides the combined gender prediction.
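To make the text-feature step of the abstract more concrete, here is a minimal sketch of word n-grams of length 1 to 2 weighted by TF-IDF with sublinear term frequency, written with the Python standard library only. The tokenizer, the smoothed IDF formula, the L2 normalization, the function names and the toy tweets are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' code. In scikit-learn, a comparable setup would typically be TfidfVectorizer with ngram_range=(1, 2) and sublinear_tf=True, followed by LinearSVC wrapped in CalibratedClassifierCV, but the abstract does not say which classes were actually used.

```python
import math
import re
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return word n-grams of length n_min..n_max from a text."""
    # Assumed preprocessing: lowercase and split on non-word characters.
    tokens = re.findall(r"\w+", text.lower())
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def sublinear_tfidf(docs):
    """Map each document to {ngram: weight}, weight = (1 + ln tf) * idf."""
    counts = [Counter(word_ngrams(doc)) for doc in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents each n-gram occurs.
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())

    # Smoothed idf (one common variant, assumed here).
    idf = {g: math.log((1 + n_docs) / (1 + df)) + 1
           for g, df in doc_freq.items()}

    vectors = []
    for c in counts:
        # Sublinear weighting: raw tf is replaced by 1 + ln(tf).
        vec = {g: (1 + math.log(tf)) * idf[g] for g, tf in c.items()}
        # L2-normalize so that document length does not dominate.
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({g: w / norm for g, w in vec.items()} if norm else vec)
    return vectors


tweets = ["good morning good people", "good night"]  # made-up toy inputs
vectors = sublinear_tfidf(tweets)
```

Each resulting dictionary is a sparse feature vector for one document. In a full system, such vectors would be passed to a linear classifier whose scores are then calibrated into probabilities, so that they can be combined with the image-based predictions in the final stacking layer.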

Keywords

gender prediction; Tweet analysis; image-based classification

Domains

Document and Text Processing; Information Retrieval [cs.IR]

Main file

Ciccone_paper_111_vf.pdf (296.38 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-02013987

Submitted on: Monday, February 11, 2019, 13:03:55

Last modified on: Wednesday, July 5, 2023, 15:28:04

Long-term archiving on: Sunday, May 12, 2019, 14:04:41

Dates and versions

hal-02013987, version 1 (11-02-2019)

Identifiers

HAL Id: hal-02013987, version 1

Cite

Giovanni Ciccone, Arthur Sultan, Léa Laporte, Elod Egyed-Zsigmond, Alaa Alhamzeh, et al. Stacked Gender Prediction from Tweet Texts and Images Notebook for PAN at CLEF 2018. CLEF 2018 - Conference and Labs of the Evaluation, Sep 2018, Avignon, France. 11p. ⟨hal-02013987⟩


Collections

CNRS UNIV-LYON1 UNIV-LYON2 INSA-LYON EC-LYON LIRIS INSA-GROUPE UDL

