Conference papers

Time-continuous Estimation of Emotion in Music with Recurrent Neural Networks

Abstract : In this paper, we describe IRIT's approach to the MediaEval 2015 "Emotion in Music" task. The goal was to predict two real-valued emotion dimensions, namely valence and arousal, in a time-continuous fashion. We chose recurrent neural networks (RNN) for their sequence modeling capabilities. Hyperparameter tuning was performed through a 10-fold cross-validation setup on the 431 songs of the development subset. With the baseline set of 260 acoustic features, our best system achieved average root mean squared errors (RMSE) of 0.250 and 0.238, and Pearson's correlation coefficients (r) of 0.703 and 0.692, for valence and arousal, respectively. These results were obtained in two stages: an RNN with only 10 hidden units made initial predictions, which were smoothed by a moving average filter and then used as input to a second RNN that generated the final predictions. On the official test data subset, this system gave our best results for arousal (RMSE=0.247, r=0.588), but not for valence, where predictions were much worse (RMSE=0.365, r=0.029). This may be explained by the fact that valence and arousal values were strongly correlated in the development subset (r=0.626), which was not the case in the test data. Finally, slight improvements over these figures were obtained by adding spectral flatness and spectral valley features to the baseline set.
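The smoothing step and the two evaluation metrics mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the window length of 5, and the edge handling (a shrunken window at the sequence boundaries) are assumptions made for illustration, since the abstract does not specify them.

```python
import math


def moving_average(values, window=5):
    """Centered moving average over a sequence of predictions.

    The window length and the shrunken window at the edges are
    illustrative assumptions, not taken from the paper.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


# Toy values standing in for first-stage RNN outputs and annotations.
raw_predictions = [0.10, 0.40, 0.05, 0.35, 0.20, 0.45]
annotations = [0.15, 0.20, 0.22, 0.28, 0.30, 0.33]

smoothed = moving_average(raw_predictions, window=3)
print("RMSE:", rmse(annotations, smoothed))
print("r:", pearson_r(annotations, smoothed))
```

In the system described above, the smoothed sequence would be fed to a second RNN rather than scored directly; the scoring here only shows how RMSE and r are computed on a per-song sequence.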

Contributor : Open Archive Toulouse Archive Ouverte (OATAO)
Submitted on : Monday, June 6, 2016 - 2:02:56 PM
Last modification on : Monday, July 4, 2022 - 9:10:12 AM




  • HAL Id : hal-01327121, version 1
  • OATAO : 15436


Thomas Pellegrini, Valentin Barrière. Time-continuous Estimation of Emotion in Music with Recurrent Neural Networks. MediaEval 2015 Multimedia Benchmark Workshop (MediaEval 2015), Sep 2015, Wurzen, Germany. pp. 1-3. ⟨hal-01327121⟩


