Journal articles

Fine grained sport action recognition with Twin spatio-temporal convolutional neural networks

Abstract: Human action recognition in video is one of the key problems in visual data interpretation. Despite intensive research, recognizing actions with low inter-class variability remains a challenge. This paper presents a new Siamese Spatio-Temporal Convolutional Neural Network (SSTCNN) for this purpose. Applied to table tennis, it detects and recognizes 20 table tennis strokes. The model was trained on a dedicated dataset, TTStroke-21, recorded in natural conditions at the Faculty of Sports of the University of Bordeaux. Our model takes as input an RGB image sequence and its computed residual optical flow. The proposed siamese network architecture comprises 3 spatio-temporal convolutional layers, followed by a fully connected layer where the data are fused. Our method reaches an accuracy of 91.4%, against 43.1% for our baseline.
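The two-branch layout described in the abstract can be illustrated by tracing tensor shapes through it. The following is a minimal sketch, not the authors' implementation. It assumes that each branch (one for RGB, one for optical flow) has its own three convolutional layers, that each convolution is followed by a pooling step, and that fusion means concatenating the flattened features before the fully connected layer. All names, channel counts, kernel sizes, and input sizes are hypothetical and chosen only for illustration.

```python
from typing import Tuple

# (channels, frames, height, width) of one video clip tensor
Shape = Tuple[int, int, int, int]


def conv3d_shape(shape: Shape, out_channels: int,
                 kernel: int = 3, padding: int = 1, stride: int = 1) -> Shape:
    """Output shape of a 3D convolution using the standard size formula."""
    _, t, h, w = shape

    def out(n: int) -> int:
        return (n + 2 * padding - kernel) // stride + 1

    return (out_channels, out(t), out(h), out(w))


def pool3d_shape(shape: Shape, kernel: int = 2) -> Shape:
    """Output shape of a non-overlapping 3D pooling step."""
    c, t, h, w = shape
    return (c, t // kernel, h // kernel, w // kernel)


def branch_features(shape: Shape, channels=(8, 16, 32)) -> int:
    """Flattened feature count after three conv + pool stages (assumed layout)."""
    for out_channels in channels:
        shape = pool3d_shape(conv3d_shape(shape, out_channels))
    c, t, h, w = shape
    return c * t * h * w


# Hypothetical clip sizes: 16 frames of 64x64 pixels.
rgb_input: Shape = (3, 16, 64, 64)    # three color channels
flow_input: Shape = (2, 16, 64, 64)   # horizontal and vertical flow components

rgb_features = branch_features(rgb_input)
flow_features = branch_features(flow_input)

# Fusion by concatenation: the fully connected layer sees both branches at once.
fused_features = rgb_features + flow_features
```

With these assumed sizes, each branch ends in a 32 x 2 x 8 x 8 volume, so the fully connected layer would receive the features of both modalities side by side. The actual layer sizes and fusion details are given in the paper itself.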

https://hal.archives-ouvertes.fr/hal-02551019
Contributor: Pierre-Etienne Martin
Submitted on: Tuesday, June 16, 2020 - 7:12:49 PM
Last modification on: Tuesday, January 4, 2022 - 6:17:05 AM

File

MTAP.pdf
Files produced by the author(s)

Citation

Pierre-Etienne Martin, Jenny Benois-Pineau, Renaud Péteri, Julien Morlier. Fine grained sport action recognition with Twin spatio-temporal convolutional neural networks. Multimedia Tools and Applications, Springer Verlag, 2020, ⟨10.1007/s11042-020-08917-3⟩. ⟨hal-02551019⟩
