Spatio-Temporal Convolutional Sparse Auto-Encoder for Sequence Classification

Moez Baccouche; Franck Mamalet; Christian Wolf; Christophe Garcia; Atilla Baskurt

doi:10.5244/C.26.124

Conference paper, Year: 2012

Moez Baccouche

Role: Author

Extraction de Caractéristiques et Identification

Franck Mamalet

Role: Author
PersonId : 751026
IdHAL : franck-mamalet

Christian Wolf

Role: Author
PersonId : 3860
IdHAL : christian-wolf
ORCID : 0000-0001-9766-3211
IdRef : 083311696

Extraction de Caractéristiques et Identification

Christophe Garcia

Role: Author
PersonId : 3989
IdHAL : christophe-garcia
ORCID : 0000-0001-7997-9837
IdRef : 098256599

Extraction de Caractéristiques et Identification

Atilla Baskurt

Role: Author
PersonId : 4271
IdHAL : atilla-baskurt
ORCID : 0000-0003-1438-0596
IdRef : 075645963

Extraction de Caractéristiques et Identification

Abstract

We present in this paper a novel learning-based approach for video sequence classification. Contrary to the dominant methodology, which relies on hand-crafted features that are manually engineered to be optimal for a specific task, our neural model automatically learns a sparse shift-invariant representation of the local 2D+t salient information, without any use of prior knowledge. To that end, a spatio-temporal convolutional sparse auto-encoder is trained to project a given input into a feature space, and to reconstruct it from its projection coordinates. Learning is performed in an unsupervised manner by minimizing a global parametrized objective function. Sparsity is ensured by adding a sparsifying logistic between the encoder and the decoder, while shift-invariance is handled by including an additional hidden variable in the objective function. The temporal evolution of the obtained sparse features is learned by a long short-term memory recurrent neural network trained to classify each sequence. We show that, since the feature learning process is problem-independent, the model achieves outstanding performance when applied to two different problems, namely human action and facial expression recognition. The results obtained are superior to the state of the art on the GEMEP-FERA dataset and among the very best on the KTH dataset.
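The encode, sparsify, decode pipeline described in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: the encoder and decoder here are plain linear maps on one flattened 2D+t patch (no convolution, no shift-invariance variable, no LSTM stage), and the sparsifying logistic is assumed to be a running-normalized exponential with hypothetical parameters `eta` and `beta`. All function names and sizes are illustrative assumptions.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sparsifying_logistic(codes, zeta_prev, eta=0.02, beta=1.0):
    """Assumed form: each unit is normalized by a running sum over past samples.

    codes:     encoder outputs for the current sample
    zeta_prev: running normalizers carried over from previous samples
    Returns the sparsified codes (each in (0, 1)) and the updated normalizers.
    """
    sparse, zeta = [], []
    for z, zp in zip(codes, zeta_prev):
        num = eta * math.exp(beta * z)
        den = num + (1.0 - eta) * zp
        sparse.append(num / den)
        zeta.append(den)
    return sparse, zeta


def forward(patch, enc_w, dec_w, zeta_prev):
    """Project a patch, sparsify the code, reconstruct, and score the result."""
    codes = matvec(enc_w, patch)                     # encoder projection
    sparse, zeta = sparsifying_logistic(codes, zeta_prev)
    recon = matvec(dec_w, sparse)                    # decoder reconstruction
    loss = sum((r - p) ** 2 for r, p in zip(recon, patch))  # squared error
    return sparse, recon, loss, zeta


if __name__ == "__main__":
    rng = random.Random(0)
    patch_size, n_codes = 3 * 4 * 4, 8  # hypothetical: 3 frames of 4x4 pixels
    enc_w = [[rng.uniform(-0.1, 0.1) for _ in range(patch_size)]
             for _ in range(n_codes)]
    dec_w = [[rng.uniform(-0.1, 0.1) for _ in range(n_codes)]
             for _ in range(patch_size)]
    zeta = [1.0] * n_codes
    for _ in range(5):
        patch = [rng.random() for _ in range(patch_size)]
        sparse, recon, loss, zeta = forward(patch, enc_w, dec_w, zeta)
    print(len(sparse), len(recon))
```

In an actual training loop the reconstruction error would be minimized with respect to the encoder and decoder weights; the sketch only shows the forward pass and the quantity being scored.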

Domains

Computer Science [cs]

https://hal.science/hal-01353046

Submitted on: Wednesday, August 10, 2016, 16:20:19

Last modified on: Wednesday, July 5, 2023, 15:28:04

Dates and versions

hal-01353046 , version 1 (10-08-2016)

Identifiers

HAL Id : hal-01353046 , version 1
DOI : 10.5244/C.26.124

Cite

Moez Baccouche, Franck Mamalet, Christian Wolf, Christophe Garcia, Atilla Baskurt. Spatio-Temporal Convolutional Sparse Auto-Encoder for Sequence Classification. British Machine Vision Conference (BMVC), Sep 2012, Guildford, United Kingdom. pp.124.1-124.12, ⟨10.5244/C.26.124⟩. ⟨hal-01353046⟩

Collections

CNRS UNIV-LYON1 UNIV-LYON2 INSA-LYON EC-LYON LIRIS LABEXIMU INSA-GROUPE UDL
