Deep Learning vs. Kernel Methods: Performance for Emotion Prediction in Videos

Abstract: Recently, mainly due to advances in deep learning, performance in scene and object recognition has improved rapidly. On the other hand, more subjective recognition tasks, such as emotion prediction, have stagnated at moderate levels. In this context, can affective computational models benefit from the breakthroughs in deep learning? This paper brings the strengths of deep learning to emotion prediction in videos. The two main contributions are as follows: (i) a new, publicly available dataset, composed of 30 movies under Creative Commons licenses and continuously annotated along the induced valence and arousal axes, is introduced; (ii) on this dataset, the performance of Convolutional Neural Networks (CNNs) with supervised fine-tuning, of Support Vector Machines for Regression (SVR), and of the combination of both (transfer learning) is computed and discussed. To the best of our knowledge, this is the first approach in the literature to use CNNs to predict dimensional affective scores from videos. The experimental results show that the limited size of the dataset prevents the learning or fine-tuning of CNN-based frameworks, but that transfer learning is a promising way to improve the performance of affective movie content analysis frameworks as long as very large datasets annotated along affective dimensions are not available.
Document type: Conference paper
Affective Computing and Intelligent Interaction (ACII), Sep 2015, Xi'an, China

https://hal.archives-ouvertes.fr/hal-01193144
Contributor: Yoann Baveye
Submitted on: Friday, 4 September 2015 - 15:39:21
Last modified on: Wednesday, 13 January 2016 - 10:08:35
Document(s) archived on: Saturday, 5 December 2015 - 12:28:53

File

ACII2013template.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01193144, version 1

Citation

Yoann Baveye, Emmanuel Dellandréa, Christel Chamaret, Liming Chen. Deep Learning vs. Kernel Methods: Performance for Emotion Prediction in Videos. Affective Computing and Intelligent Interaction (ACII), Sep 2015, Xi'an, China. 〈hal-01193144〉

Metrics

Record views: 231
Document downloads: 1355