Deep Learning vs. Kernel Methods: Performance for Emotion Prediction in Videos

Yoann Baveye; Emmanuel Dellandréa; Christel Chamaret; Liming Chen

Conference paper. Year: 2015

Yoann Baveye

Role: Author
PersonId: 5075
IdHAL: yoann-baveye
IdRef: 191269972

Extraction de Caractéristiques et Identification

Technicolor R & I [Cesson Sévigné]

Emmanuel Dellandréa

Role: Author
PersonId: 7701
IdHAL: emmanuel-dellandrea
ORCID: 0000-0001-7346-228X
IdRef: 114133034

Extraction de Caractéristiques et Identification

Christel Chamaret

Role: Author
PersonId: 774850
IdRef: 178625191

Technicolor R & I [Cesson Sévigné]

Liming Chen

Role: Author
PersonId: 7562
IdHAL: liming-chen
IdRef: 067400175

Extraction de Caractéristiques et Identification

Abstract

Recently, mainly due to advances in deep learning, performance in scene and object recognition has improved rapidly. In contrast, more subjective recognition tasks, such as emotion prediction, have stagnated at moderate levels. In this context, can affective computational models benefit from the breakthroughs in deep learning? This paper brings the strengths of deep learning to emotion prediction in videos. The two main contributions are as follows: (i) a new, publicly available dataset is introduced, composed of 30 movies under Creative Commons licenses and continuously annotated along the induced valence and arousal axes; and (ii) on this dataset, the performance of Convolutional Neural Networks (CNNs) with supervised fine-tuning, of Support Vector Machines for Regression (SVR), and of the combination of both (transfer learning) is computed and discussed. To the best of our knowledge, this is the first approach in the literature to use CNNs to predict dimensional affective scores from videos. The experimental results show that the limited size of the dataset prevents the learning or fine-tuning of CNN-based frameworks, but that transfer learning is a promising way to improve the performance of affective movie content analysis frameworks as long as very large datasets annotated along affective dimensions are not available.

Keywords

continuous emotion prediction; deep learning; benchmarking; affective computing

Domains

Multimedia [cs.MM]

Main file

ACII2013template.pdf (1.91 MB)

Origin: Files produced by the author(s)

https://hal.science/hal-01193144

Submitted on: Friday, September 4, 2015, 15:39:21

Last modified on: Wednesday, July 5, 2023, 15:28:04

Long-term archiving on: Saturday, December 5, 2015, 12:28:53

Dates and versions

hal-01193144 , version 1 (04-09-2015)

Identifiers

HAL Id : hal-01193144 , version 1

Cite

Yoann Baveye, Emmanuel Dellandréa, Christel Chamaret, Liming Chen. Deep Learning vs. Kernel Methods: Performance for Emotion Prediction in Videos. Affective Computing and Intelligent Interaction (ACII), Sep 2015, Xi'an, China. ⟨hal-01193144⟩

Collections

CNRS UNIV-LYON1 UNIV-LYON2 INSA-LYON EC-LYON LIRIS LABEXIMU INSA-GROUPE UDL ANR EC_LYON_STRICT

307 views

2380 downloads
