Deep Temporal Pyramid Design For Action Recognition - HAL open archive
Conference paper, Year: 2019

Deep Temporal Pyramid Design For Action Recognition

Abstract

Deep convolutional neural networks (CNNs) now achieve significant gains in a range of pattern recognition tasks, including action recognition. Current CNNs are increasingly deep and data-hungry, which makes their success dependent on the abundance of labeled training data. CNNs also rely on max/average pooling, which reduces the dimensionality of output layers and hence attenuates their sensitivity to the availability of labeled data. However, this process may dilute the information of upstream convolutional layers and thereby reduce the discriminative power of the trained representations, especially when the learned categories are fine-grained. In this paper, we introduce a novel hierarchical aggregation design for final pooling that controls the granularity of the learned representations with respect to the actual granularity of action categories. Our solution is based on a tree-structured temporal pyramid that aggregates the outputs of CNNs at different levels. Top levels of this hierarchy are dedicated to coarse categories, while deep levels are more suitable for fine-grained ones. The design of our temporal pyramid is based on solving a constrained minimization problem whose solution corresponds to the distribution of weights of the different representations in the temporal pyramid. Experiments conducted on the challenging UCF101 database show the relevance of our hierarchical design compared with other related methods.
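
To make the idea of a temporal pyramid concrete, here is a minimal Python sketch of pooling frame-level features over a tree of temporal segments. It is an illustration under stated assumptions, not the authors' implementation: the binary splitting, the average pooling at each node, the function names (`average`, `temporal_pyramid`, `aggregate`), and the hand-picked per-level weights are all hypothetical. In the paper the weights come from a constrained minimization problem, which is not reproduced here; the sketch only assumes they are nonnegative and sum to one.

```python
def average(frames):
    """Average-pool a non-empty list of equal-length feature vectors."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def temporal_pyramid(frames, depth):
    """Return the pooled node vectors of each pyramid level.

    Assumption: level l splits the sequence into 2**l contiguous segments,
    so level 0 is plain global pooling and deeper levels keep finer
    temporal detail.
    """
    n = len(frames)
    levels = []
    for level in range(depth):
        parts = 2 ** level
        if parts > n:
            raise ValueError("sequence too short for this depth")
        # Integer boundaries; every segment is non-empty because parts <= n.
        bounds = [i * n // parts for i in range(parts + 1)]
        levels.append(
            [average(frames[bounds[i]:bounds[i + 1]]) for i in range(parts)]
        )
    return levels


def aggregate(levels, weights):
    """Concatenate all node vectors, scaling each level by its weight.

    Assumption: one weight per level, nonnegative and summing to one.
    The weights are supplied by the caller, not learned.
    """
    if len(weights) != len(levels):
        raise ValueError("need exactly one weight per level")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to one")
    out = []
    for w, nodes in zip(weights, levels):
        for vec in nodes:
            out.extend(w * v for v in vec)
    return out


# Four frames with 2-dimensional (toy) CNN features.
frames = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
levels = temporal_pyramid(frames, depth=2)
representation = aggregate(levels, weights=[0.5, 0.5])
```

Putting more weight on level 0 favors a coarse, global description of the video, while shifting weight to deeper levels emphasizes finer temporal structure, which matches the coarse versus fine-grained trade-off described in the abstract.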
Main file
icassp19.pdf (753.98 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03089627, version 1 (28-12-2020)

Cite

Ahmed Mazari, Hichem Sahbi. Deep Temporal Pyramid Design For Action Recognition. IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP, May 2019, Brighton, United Kingdom. pp. 2077-2081, ⟨10.1109/ICASSP.2019.8683035⟩. ⟨hal-03089627⟩