Stacked Gender Prediction from Tweet Texts and Images Notebook for PAN at CLEF 2018

Abstract : This paper describes our participation in the PAN 2018 Author Profiling shared task. Given texts and images from Twitter authors, the goal is to predict each author's gender. We considered all three languages (Arabic, English and Spanish) and all three prediction types (texts only, images only, and combined). The final submitted system is a stacked classifier composed of two main parts. The first part, based on previous PAN Author Profiling editions, handles gender prediction from texts. It consists of a pipeline of preprocessing, word n-grams of order 1 to 2, TF-IDF with sublinear weighting, linear support vector classification and probability calibration. The second part is formed by several layers of classifiers used for gender estimation from images: four base classifiers (object detection, face recognition, colour histograms, local binary patterns) in the first layer, a meta-classifier in the second layer and an aggregation classifier in the third layer. Finally, the two gender predictions, from texts and from images, feed into a last-layer classifier that provides the combined gender predictions.
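As an illustration of the feature-extraction step named in the abstract (word n-grams of order 1 to 2 with sublinearly weighted TF-IDF), here is a minimal standard-library sketch. It is not the authors' code. The tokenisation (lowercasing and splitting on word characters), the smoothed IDF formula, the L2 normalisation, and the function names `word_ngrams` and `sublinear_tfidf` are all assumptions made for this example; the abstract does not specify them. The linear support vector classifier and the probability calibration that follow in the described pipeline are not shown.

```python
import math
import re
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return word n-grams of order n_min..n_max.

    Assumed preprocessing: lowercase, then keep runs of word characters.
    """
    tokens = re.findall(r"\w+", text.lower())
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def sublinear_tfidf(docs):
    """Return one {ngram: weight} dict per document.

    Sublinear weighting replaces the raw count tf with 1 + ln(tf).
    The smoothed IDF and the L2 normalisation are assumptions of this sketch.
    """
    counts = [Counter(word_ngrams(d)) for d in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for c in counts:
        df.update(c.keys())

    vectors = []
    for c in counts:
        vec = {}
        for term, tf in c.items():
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            vec[term] = (1.0 + math.log(tf)) * idf
        norm = math.sqrt(sum(w * w for w in vec.values()))
        if norm:
            vec = {t: w / norm for t, w in vec.items()}
        vectors.append(vec)
    return vectors


# Hypothetical toy input, not taken from the PAN 2018 data.
docs = ["good morning good people", "good night"]
vectors = sublinear_tfidf(docs)
```

With sublinear weighting, a term that occurs twice in a document contributes 1 + ln 2 rather than 2 before the IDF factor is applied, which dampens the influence of words an author repeats many times.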

https://hal.archives-ouvertes.fr/hal-02013987
Contributor : Elöd Egyed-Zsigmond
Submitted on : Monday, February 11, 2019 - 1:03:55 PM
Last modification on : Wednesday, April 3, 2019 - 1:06:14 AM
Long-term archiving on : Sunday, May 12, 2019 - 2:04:41 PM

File

Ciccone_paper_111_vf.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02013987, version 1

Citation

Giovanni Ciccone, Arthur Sultan, Léa Laporte, Elöd Egyed-Zsigmond, Alaa Alhamzeh, et al. Stacked Gender Prediction from Tweet Texts and Images Notebook for PAN at CLEF 2018. CLEF 2018 - Conference and Labs of the Evaluation, Sep 2018, Avignon, France. 11p. ⟨hal-02013987⟩
