Human Action Recognition: Pose-based Attention draws focus to Hands

Abstract : We propose a new spatio-temporal attention-based mechanism for human action recognition that automatically attends to the most important hands and detects the most discriminative moments in an action. Attention is handled in a recurrent manner using a Recurrent Neural Network (RNN) and is fully differentiable. In contrast to standard soft-attention mechanisms, our approach does not use the hidden RNN state as input to the attention model. Instead, attention distributions are drawn using external information: human articulated pose. We performed an extensive ablation study to show the strengths of this approach, and we particularly studied the conditioning aspect of the attention mechanism. We evaluate the method on the largest currently available human action recognition dataset, NTU-RGB+D, and report state-of-the-art results. Another advantage of our model is a degree of explainability: the spatial and temporal attention distributions at test time make it possible to study and verify which parts of the input data the method focuses on.
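The central idea of the abstract, attention weights computed from pose rather than from the recurrent hidden state, can be illustrated with a minimal sketch. This is not the authors' implementation: the linear scoring layer, the function names, the two-crop setup and all numeric values are assumptions chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pose_conditioned_attention(hand_features, pose, weights, bias):
    """Soft attention over hand crops, conditioned on pose only.

    hand_features: one feature vector per hand crop (list of lists).
    pose:          flattened joint coordinates (list of floats).
    weights, bias: hypothetical parameters of a linear scoring layer,
                   one row / one bias term per hand crop.
    """
    # Scores depend on the pose alone; no RNN hidden state is involved.
    scores = [
        sum(w * p for w, p in zip(row, pose)) + b
        for row, b in zip(weights, bias)
    ]
    attn = softmax(scores)
    # Attended feature: convex combination of the per-hand features.
    dim = len(hand_features[0])
    attended = [
        sum(a * feat[d] for a, feat in zip(attn, hand_features))
        for d in range(dim)
    ]
    return attended, attn


# Toy example with two hand crops and a 2-D "pose" (illustrative values).
pose = [1.0, 0.0]
weights = [[2.0, 0.0], [0.0, 2.0]]
bias = [0.0, 0.0]
hand_features = [[1.0, 0.0], [0.0, 1.0]]

attended, attn = pose_conditioned_attention(hand_features, pose, weights, bias)
```

Because the scores are a function of the pose alone, the same pose yields the same attention distribution whatever the hand features are. The returned weights can also be inspected directly at test time, which is the kind of explainability the abstract refers to.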
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01575390
Contributor : Christian Wolf
Submitted on : Saturday, August 19, 2017 - 3:40:57 PM
Last modification on : Tuesday, July 2, 2019 - 4:02:03 PM

Identifiers

  • HAL Id : hal-01575390, version 1

Citation

Fabien Baradel, Christian Wolf, Julien Mille. Human Action Recognition: Pose-based Attention draws focus to Hands. ICCV Workshop on Hands in Action, Oct 2017, Venice, Italy. ⟨hal-01575390⟩
