D. Weinland, R. Ronfard, and E. Boyer, A Survey of Vision-based Methods for Action Representation, Segmentation and Recognition, vol.115, pp.224-241, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00640088

D. Lowe, Distinctive Image Features from Scale-invariant Keypoints, vol.60, pp.91-110, 2004.

I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld, Learning Realistic Human Actions from Movies, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1-8, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00548659

P. Dollár, V. Rabaud, G. Cottrell, and S. Belongie, Behavior Recognition via Sparse Spatio-temporal Features, Proceedings of the IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance (VS-PETS), pp.65-72, 2005.

M. Ye and R. Yang, Real-time Simultaneous Pose and Shape Estimation for Articulated Objects using a Single Depth Camera, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.2345-2352, 2014.

J. Wang, Z. Liu, Y. Wu, and J. Yuan, Mining Actionlet Ensemble for Action Recognition with Depth Cameras, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1290-1297, 2012.

L. Xia, C. Chen, and J. K. Aggarwal, View-Invariant Human Action Recognition using Histograms of 3D Joints, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.20-27, 2012.

R. Chaudhry, F. Ofli, G. Kurillo, R. Bajcsy, and R. Vidal, Bio-inspired Dynamic 3D Discriminative Skeletal Features for Human Action Recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.471-478, 2013.

R. Vemulapalli, F. Arrate, and R. Chellappa, Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.588-595, 2014.

W. Ding, K. Liu, X. Fu, and F. Cheng, Profile HMMs for Skeleton-based Human Action Recognition, Signal Process. Image Commun., vol.42, pp.109-119, 2016.

Z. Zhang, Microsoft Kinect Sensor and Its Effect, IEEE Multimed, vol.19, pp.4-10, 2012.

Z. Cao, T. Simon, S. Wei, and Y. Sheikh, Realtime Multi-person 2D Pose Estimation using Part Affinity Fields, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.7291-7299, 2017.

H. S. Fang, S. Xie, Y. W. Tai, and C. Lu, RMPE: Regional Multi-person Pose Estimation, 2017.

H. Pham, M. Guan, B. Zoph, Q. Le, and J. Dean, Efficient Neural Architecture Search via Parameters Sharing, Proceedings of the International Conference on Machine Learning (ICML), pp.4095-4104, 2018.

G. Johansson, Visual Motion Perception, Sci. Am, vol.232, pp.76-89, 1975.

J. Gu, X. Ding, S. Wang, and Y. Wu, Action and Gait Recognition from Recovered 3D Human Joints, IEEE Trans. Syst. Man Cybern, vol.40, pp.1021-1033, 2010.

A. Newell, K. Yang, and J. Deng, Stacked Hourglass Networks for Human Pose Estimation, Proceedings of the European Conference on Computer Vision (ECCV), pp.8-16, 2016.

C. Ionescu, D. Papava, V. Olaru, and C. Sminchisescu, Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments, IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI), vol.36, pp.1325-1339, 2014.

W. Li, Z. Zhang, and Z. Liu, Action Recognition Based on a Bag of 3D Points, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.9-14, 2010.

K. Yun, J. Honorio, D. Chattopadhyay, T. L. Berg, and D. Samaras, Two-person Interaction Detection using Body-pose Features and Multiple Instance Learning, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.28-35, 2012.

N. Sarafianos, B. Boteanu, B. Ionescu, and I. A. Kakadiaris, 3D Human Pose Estimation: A Review of the Literature and Analysis of Covariates, CVIU, vol.152, pp.1-20, 2016.

L. Lo Presti and M. La Cascia, 3D Skeleton-based Human Action Classification: A Survey, Pattern Recognit, vol.53, pp.130-147, 2016.

C. Sminchisescu, 3D Human Motion Analysis in Monocular Video Techniques and Challenges, Proceedings of the IEEE International Conference on Video and Signal Based Surveillance (ICVSBS), p.76, 2006.

V. Ramakrishna, T. Kanade, and Y. Sheikh, Reconstructing 3D Human Pose from 2D Image Landmarks, Proceedings of the European Conference on Computer Vision (ECCV), pp.573-586, 2012.

S. Li and A. B. Chan, 3D Human Pose Estimation from Monocular Images with Deep Convolutional Neural Network, Proceedings of the Asian Conference on Computer Vision (ACCV), pp.332-347, 2014.

B. Tekin, A. Rozantsev, V. Lepetit, and P. Fua, Direct Prediction of 3D Body Poses from Motion Compensated Sequences, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.991-1000, 2016.

G. Pavlakos, X. Zhou, K. G. Derpanis, and K. Daniilidis, Coarse-to-fine Volumetric Prediction for Single-image 3D Human Pose, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.7025-7034, 2017.

D. Pavllo, C. Feichtenhofer, D. Grangier, and M. Auli, 3D Human Pose Estimation in Video with Temporal Convolutions and Semi-supervised Training, 2018.

D. Mehta, S. Sridhar, O. Sotnychenko, H. Rhodin, M. Shafiei et al., VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera, vol.36, p.44, 2017.

I. Katircioglu, B. Tekin, M. Salzmann, V. Lepetit, and P. Fua, Learning Latent Representations of 3D Human Pose with Deep Neural Networks, vol.126, pp.1326-1341, 2018.
URL : https://hal.archives-ouvertes.fr/hal-02509358

F. Yu and V. Koltun, Multi-scale Context Aggregation by Dilated Convolutions, 2015.

K. He, X. Zhang, S. Ren, and J. Sun, Deep Residual Learning for Image Recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.770-778, 2016.

S. Hochreiter and J. Schmidhuber, Long Short-Term Memory, Neural Comput, vol.9, pp.1735-1780, 1997.

J. Martinez, R. Hossain, J. Romero, and J. Little, A Simple Yet Effective Baseline for 3D Human Pose Estimation, Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp.2640-2649, 2017.

F. Lv and R. Nevatia, Recognition and Segmentation of 3D Human Action Using HMM and Multi-class AdaBoost, Proceedings of the European Conference on Computer Vision (ECCV), pp.359-372, 2006.

L. Han, X. Wu, W. Liang, G. Hou, and Y. Jia, Discriminative Human Action Recognition in the Learned Hierarchical Manifold Space, Image Vis. Comput, vol.28, 2010.

J. Liu, A. Shahroudy, D. Xu, and G. Wang, Spatio-temporal LSTM with Trust Gates for 3D Human Action Recognition, Proceedings of the European Conference on Computer Vision (ECCV), pp.816-833, 2016.

Y. Du, W. Wang, and L. Wang, Hierarchical Recurrent Neural Network for Skeleton based Action Recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1110-1118, 2015.

A. Shahroudy, J. Liu, T. T. Ng, and G. Wang, NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1010-1019, 2016.

T. N. Sainath, O. Vinyals, A. Senior, and H. Sak, Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.4580-4584, 2015.

G. Chéron, I. Laptev, and C. Schmid, P-CNN: Pose-based CNN Features for Action Recognition, Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp.13-16, 2015.

B. Yao and L. Fei-Fei, Modeling Mutual Context of Object and Human Pose in Human-object Interaction Activities, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.17-24, 2010.

B. X. Nie, C. Xiong, and S. Zhu, Joint Action Recognition and Pose Estimation from Video, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1293-1301, 2015.

D. C. Luvizon, D. Picard, and H. Tabia, 2D/3D Pose Estimation and Action Recognition using Multitask Deep Learning, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.5137-5146, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01815703

P. J. Huber, Robust Estimation of a Location Parameter, In Breakthroughs in Statistics, pp.492-518, 1992.

C. Szegedy, S. Ioffe, V. Vanhoucke, and A. Alemi, Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp.12-17, 2016.

G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger, Densely Connected Convolutional Networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.2261-2269, 2017.

B. Zoph and Q. V. Le, Neural Architecture Search with Reinforcement Learning, arXiv, 2017.

S. Ioffe and C. Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Proceedings of the International Conference on Machine Learning (ICML), pp.6-11, 2015.

G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, Improving Neural Networks by Preventing Co-adaptation of Feature Detectors, 2012.

G. Klambauer, T. Unterthiner, A. Mayr, and S. Hochreiter, Self-Normalizing Neural Networks, Adv. Neural Inf. Process. Syst, pp.971-980, 2017.

H. Pham, L. Khoudour, A. Crouzil, P. Zegers, and S. A. Velastin, Exploiting Deep Residual Networks for Human Action Recognition from Skeletal Data, vol.170, pp.51-66, 2018.
URL : https://hal.archives-ouvertes.fr/hal-02192228

H. Pham, L. Khoudour, A. Crouzil, P. Zegers, and S. A. Velastin, Skeletal Movement to Color Map: A Novel Representation for 3D Action Recognition with Inception Residual Networks, Proceedings of the IEEE International Conference on Image Processing (ICIP), pp.3483-3487, 2018.
URL : https://hal.archives-ouvertes.fr/hal-02193711

H. Pham, H. Salmane, L. Khoudour, A. Crouzil, P. Zegers et al., Spatio-Temporal Image Representation of 3D Skeletal Movements for View-Invariant Action Recognition with Deep Convolutional Neural Networks, Sensors, vol.19, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02192281

S. M. Pizer, E. P. Amburn, J. D. Austin, R. Cromartie, A. Geselowitz et al., Adaptive Histogram Equalization and Its Variations, Comput. Vision Graph. Image Process., vol.39, pp.355-368, 1987.

H. H. Pham, H. Salmane, L. Khoudour, A. Crouzil, P. Zegers et al., A Deep Learning Approach for Real-Time 3D Human Action Recognition from Skeletal Data, Proceedings of the International Conference on Image Analysis and Recognition, pp.18-32, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02883879

K. He, X. Zhang, S. Ren, and J. Sun, Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp.1026-1034, 2015.

D. Kingma and J. Ba, Adam: A Method for Stochastic Optimization, arXiv, 2014.

Y. Nesterov, A Method for Solving a Convex Programming Problem with Convergence Rate O(1/k^2), Sov. Math. Dokl, pp.372-377, 1983.

I. Loshchilov and F. Hutter, SGDR: Stochastic Gradient Descent with Warm Restarts, arXiv, 2016.

Y. Du, Y. Wong, Y. Liu, F. Han, Y. Gui et al., Marker-less 3D Human Motion Capture with Monocular Image Sequence and Height-maps, Proceedings of the European Conference on Computer Vision (ECCV), vol.8, pp.20-36, 2016.

S. Park, J. Hwang, and N. Kwak, 3D Human Pose Estimation using Convolutional Neural Networks with 2D Pose Information, Proceedings of the European Conference on Computer Vision (ECCV), pp.156-169, 2016.

X. Zhou, M. Zhu, S. Leonardos, K. G. Derpanis, and K. Daniilidis, Sparseness Meets Deepness: 3D Human Pose Estimation from Monocular Video, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.4966-4975, 2016.

X. Zhou, X. Sun, W. Zhang, S. Liang, and Y. Wei, Deep Kinematic Pose Regression, Proceedings of the European Conference on Computer Vision (ECCV), pp.8-16, 2016.

D. Mehta, H. Rhodin, D. Casas, P. Fua, O. Sotnychenko et al., Monocular 3D Human Pose Estimation in the Wild using Improved CNN Supervision, Proceedings of the International Conference on 3D Vision (3DV), pp.506-516, 2017.

S. Liang, X. Sun, and Y. Wei, Compositional Human Pose Regression, Comput. Vis. Image Underst, pp.1-8, 2018.

C. Chen, K. Liu, and N. Kehtarnavaz, Real-time Human Action Recognition based on Depth Motion Maps, J. Real-Time Image Process., vol.12, 2016.

P. Wang, C. Yuan, W. Hu, B. Li, and Y. Zhang, Graph Based Skeleton Motion Representation and Similarity Measurement for Action Recognition, Proceedings of the British Machine Vision Conference (BMVC), pp.19-22, 2016.

J. Weng, C. Weng, and J. Yuan, Spatio-Temporal Naive-Bayes Nearest-Neighbor (ST-NBNN) for Skeleton-Based Action Recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.22-26, 2017.

H. Xu, E. Chen, C. Liang, L. Qi, and L. Guan, Spatio-temporal Pyramid Model based on Depth Maps for Action Recognition, Proceedings of the IEEE International Workshop on Multimedia Signal Processing (MMSP), pp.1-6, 2015.

I. Lee, D. Kim, S. Kang, and S. Lee, Ensemble Deep Learning for Skeleton-based Action Recognition using Temporal Sliding LSTM Networks, Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp.1012-1020, 2017.

S. Song, C. Lan, J. Xing, W. Zeng, and J. Liu, An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data, Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp.4-9, 2017.

J. Weng, C. Weng, J. Yuan, and Z. Liu, Discriminative Spatio-Temporal Pattern Discovery for 3D Action Recognition, IEEE Trans. Circuits Syst. Video Technol. (TCSVT), vol.29, pp.1077-1089, 2019.

Q. Ke, M. Bennamoun, S. An, F. Sohel, and F. Boussaid, A New Representation of Skeleton Sequences for 3D Action Recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.4570-4579, 2017.

Y. Tas and P. Koniusz, CNN-based Action Recognition and Supervised Domain Adaptation on 3D Body Skeletons via Kernel Feature Maps, Proceedings of the British Machine Vision Conference (BMVC), p.158, 2018.

H. Wang and L. Wang, Modeling Temporal Dynamics and Spatial Configurations of Actions Using Two-Stream Recurrent Neural Networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.3633-3642, 2017.

J. Liu, G. Wang, L. Duan, K. Abdiyeva, and A. C. Kot, Skeleton-Based Human Action Recognition With Global Context-Aware Attention LSTM Networks, IEEE Trans. Image Process. (TIP), vol.27, pp.1586-1599, 2018.

P. Zhang, C. Lan, J. Xing, W. Zeng, J. Xue et al., View Adaptive Neural Networks for High Performance Skeleton-based Human Action Recognition, IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI), vol.41, pp.1963-1978, 2019.