Spatio-Temporal Convolutional Sparse Auto-Encoder for Sequence Classification

Moez Baccouche (1), Franck Mamalet, Christian Wolf (1), Christophe Garcia (1), Atilla Baskurt (1)
(1) imagine - Extraction de Caractéristiques et Identification
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract: We present a novel learning-based approach to video sequence classification. In contrast to the dominant methodology, which relies on hand-crafted features manually engineered to be optimal for a specific task, our neural model automatically learns a sparse shift-invariant representation of the local 2D+t salient information, without using any prior knowledge. To that end, a spatio-temporal convolutional sparse auto-encoder is trained to project a given input into a feature space and to reconstruct it from its projection coordinates. Learning is performed in an unsupervised manner by minimizing a global parametrized objective function. Sparsity is ensured by adding a sparsifying logistic between the encoder and the decoder, while shift-invariance is handled by including an additional hidden variable in the objective function. The temporal evolution of the resulting sparse features is learned by a long short-term memory recurrent neural network trained to classify each sequence. We show that, because the feature learning process is problem-independent, the model performs strongly on two different problems, namely human action recognition and facial expression recognition. The results obtained are superior to the state of the art on the GEMEP-FERA dataset and among the very best on the KTH dataset.
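As a rough illustration of the encode, sparsify, decode pipeline described in the abstract, here is a minimal NumPy sketch for a single spatio-temporal filter. Everything in it is an assumption made for illustration, not the authors' implementation: the function names, the single-filter setup, the recursive form of the sparsifying logistic (borrowed from earlier sparse-coding work and applied here along the time axis), and the values of `eta` and `beta` are all hypothetical. The shift-invariance hidden variable and the LSTM classifier are left out.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def encode(video, filt):
    """Valid 3D cross-correlation of a (T, H, W) video with one (t, h, w) filter."""
    windows = sliding_window_view(video, filt.shape)
    return np.einsum("abcijk,ijk->abc", windows, filt)


def sparsifying_logistic(codes, eta=0.02, beta=1.0):
    """Running-normalised logistic along the time axis (assumed form)."""
    out = np.empty_like(codes)
    zeta = np.ones(codes.shape[1:])  # assumed initial value of the running term
    for k in range(codes.shape[0]):
        num = eta * np.exp(beta * codes[k])
        zeta = num + (1.0 - eta) * zeta
        out[k] = num / zeta
    return out


def decode(codes, filt, out_shape):
    """Transpose of encode: place a scaled copy of the filter at every code position."""
    recon = np.zeros(out_shape)
    a, b, c = codes.shape
    for i, j, k in np.ndindex(*filt.shape):
        recon[i:i + a, j:j + b, k:k + c] += filt[i, j, k] * codes
    return recon


def reconstruction_energy(video, filt):
    """Squared reconstruction error of the encode -> sparsify -> decode chain."""
    sparse = sparsifying_logistic(encode(video, filt))
    recon = decode(sparse, filt, video.shape)
    return 0.5 * np.sum((video - recon) ** 2), sparse


# Usage on random data (hypothetical sizes).
rng = np.random.default_rng(0)
video = rng.standard_normal((6, 8, 8))
filt = 0.1 * rng.standard_normal((2, 3, 3))
energy, sparse_codes = reconstruction_energy(video, filt)
```

In an unsupervised training loop, an energy of this kind would be minimised with respect to the filter weights; the sketch only evaluates it once.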
Document type:
Conference paper
R. Bowden, J. Collomosse and K. Mikolajczyk. British Machine Vision Conference (BMVC), Sep 2012, Guildford, United Kingdom. BMVA Press, pp.124.1-124.12, 2012, 〈10.5244/C.26.124〉

https://hal.archives-ouvertes.fr/hal-01353046
Contributor: Équipe Gestionnaire Des Publications Si Liris
Submitted on: Wednesday, 10 August 2016 - 16:20:19
Last modified on: Wednesday, 31 October 2018 - 12:24:25

Citation

Moez Baccouche, Franck Mamalet, Christian Wolf, Christophe Garcia, Atilla Baskurt. Spatio-Temporal Convolutional Sparse Auto-Encoder for Sequence Classification. R. Bowden, J. Collomosse and K. Mikolajczyk. British Machine Vision Conference (BMVC), Sep 2012, Guildford, United Kingdom. BMVA Press, pp.124.1-124.12, 2012, 〈10.5244/C.26.124〉. 〈hal-01353046〉