Exploiting Deep Residual Networks for Human Action Recognition from Skeletal Data

Huy-Hieu Pham; Louahdi Khoudour; Alain Crouzil; Pablo Zegers; Sergio A. Velastin

doi:10.1016/j.cviu.2018.03.003

Journal article, Computer Vision and Image Understanding, Year: 2018

Huy-Hieu Pham

Role: Author

Louahdi Khoudour

Role: Author

Intelligent transport systems

Alain Crouzil

Role: Author
PersonId : 184761
IdHAL : alain-crouzil
ORCID : 0000-0001-7040-2978

Pablo Zegers

Role: Author

Sergio A. Velastin

Role: Author

Abstract

The computer vision community is currently focusing on solving action recognition problems in real videos, which contain thousands of samples with many challenges. In this process, Deep Convolutional Neural Networks (D-CNNs) have played a significant role in advancing the state of the art in various vision-based action recognition systems. Recently, the introduction of residual connections in conjunction with a more traditional CNN model in a single architecture called Residual Network (ResNet) has shown impressive performance and great potential for image recognition tasks. In this paper, we investigate and apply deep ResNets for human action recognition using skeletal data provided by depth sensors. First, the 3D coordinates of the human body joints carried in skeleton sequences are transformed into image-based representations and stored as RGB images. These color images are able to capture the spatio-temporal evolution of 3D motions from skeleton sequences and can be efficiently learned by D-CNNs. We then propose a novel deep learning architecture based on ResNets to learn features from the obtained color-based representations and classify them into action classes. The proposed method is evaluated on three challenging benchmark datasets: MSR Action 3D, KARD, and NTU-RGB+D. Experimental results demonstrate that our method achieves state-of-the-art performance on all these benchmarks whilst requiring fewer computational resources. In particular, the proposed method surpasses previous approaches by a significant margin of 3.4% on the MSR Action 3D dataset, 0.67% on the KARD dataset, and 2.5% on the NTU-RGB+D dataset.
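To make the first step of the abstract concrete, the following is a minimal sketch of how a skeleton sequence might be stored as an RGB image. The abstract does not specify the encoding, so everything here is an assumption for illustration: the function name `skeleton_to_rgb` is hypothetical, coordinates are min-max normalized over the whole sequence, the x, y, z values are mapped to the R, G, B channels, and the image is laid out with one row per joint and one column per frame. The paper's actual transformation may differ.

```python
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]   # (x, y, z) of one body joint
Frame = Sequence[Joint]              # all joints observed in one frame
Pixel = Tuple[int, ...]              # (R, G, B), each value in 0..255


def skeleton_to_rgb(sequence: Sequence[Frame]) -> List[List[Pixel]]:
    """Encode a skeleton sequence as an RGB image (rows = joints, columns = frames).

    Assumed encoding (not taken from the paper): min-max normalize every
    coordinate over the whole sequence, then map x, y, z to R, G, B.
    """
    if not sequence or not sequence[0]:
        raise ValueError("sequence needs at least one frame and one joint")
    n_joints = len(sequence[0])
    if any(len(frame) != n_joints for frame in sequence):
        raise ValueError("every frame must have the same number of joints")

    # Global minimum and maximum over all coordinates of all joints and frames.
    values = [c for frame in sequence for joint in frame for c in joint]
    lo, hi = min(values), max(values)
    span = hi - lo

    def to_byte(c: float) -> int:
        # A constant sequence has no range; map it to 0 to avoid division by zero.
        if span == 0:
            return 0
        return int(round(255 * (c - lo) / span))

    # image[j][t] is the pixel for joint j at frame t.
    return [
        [tuple(to_byte(c) for c in sequence[t][j]) for t in range(len(sequence))]
        for j in range(n_joints)
    ]


if __name__ == "__main__":
    # Toy sequence: 2 frames, 2 joints per frame, coordinates already in [0, 1].
    toy = [
        [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)],  # frame 0
        [(1.0, 1.0, 0.0), (0.0, 1.0, 1.0)],  # frame 1
    ]
    for row in skeleton_to_rgb(toy):
        print(row)
```

Under these assumptions, each row traces how one joint moves over time and each column is a snapshot of the whole body, which is one way an image can carry both the spatial and the temporal structure of a motion before it is passed to a convolutional network.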

Domains

Engineering sciences [physics]

Contributor: Lara Désiré

https://hal.science/hal-02192228

Submitted on: Tuesday, 23 July 2019, 17:20:28

Last modified on: Wednesday, 27 September 2023, 19:50:04

Dates and versions

hal-02192228 , version 1 (23-07-2019)

Identifiers

HAL Id : hal-02192228 , version 1
ARXIV : 1803.07781
DOI : 10.1016/j.cviu.2018.03.003

Cite

Huy-Hieu Pham, Louahdi Khoudour, Alain Crouzil, Pablo Zegers, Sergio A. Velastin. Exploiting Deep Residual Networks for Human Action Recognition from Skeletal Data. Computer Vision and Image Understanding, 2018, ⟨10.1016/j.cviu.2018.03.003⟩. ⟨hal-02192228⟩

Collections

CEREMA