M. Andriluka, L. Pishchulin, P. Gehler, and B. Schiele, 2D Human Pose Estimation: New Benchmark and State of the Art Analysis, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.471

URL : http://ps.is.tue.mpg.de/publications/168/get_file/

M. Andriluka, S. Roth, and B. Schiele, Pictorial structures revisited: People detection and articulated pose estimation, 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp.1014-1021, 2009.
DOI : 10.1109/CVPR.2009.5206754

URL : http://www.gris.informatik.tu-darmstadt.de/~sroth/pubs/cvpr09andriluka.pdf

F. Baradel, C. Wolf, and J. Mille, Pose-conditioned spatiotemporal attention for human action recognition, arXiv preprint, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01593548

F. Baradel, C. Wolf, J. Mille, and G. W. Taylor, Glimpse clouds: Human activity recognition from unstructured feature points, Computer Vision and Pattern Recognition (CVPR), 2018.
URL : https://hal.archives-ouvertes.fr/hal-01713109

V. Belagiannis, C. Rupprecht, G. Carneiro, and N. Navab, Robust Optimization for Deep Regression, 2015 IEEE International Conference on Computer Vision (ICCV), pp.2830-2838, 2015.
DOI : 10.1109/ICCV.2015.324

URL : http://arxiv.org/pdf/1505.06606

V. Belagiannis and A. Zisserman, Recurrent human pose estimation, CoRR, abs/1605.02914, 2016.
DOI : 10.1109/fg.2017.64

URL : http://arxiv.org/pdf/1605.02914

A. Bulat and G. Tzimiropoulos, Human Pose Estimation via Convolutional Part Heatmap Regression, European Conference on Computer Vision (ECCV), pp.717-732, 2016.
URL : http://arxiv.org/pdf/1609.01743

C. Cao, Y. Zhang, C. Zhang, and H. Lu, Body joint guided 3D deep convolutional descriptors for action recognition, CoRR, abs/1704.07160, 2017.
DOI : 10.1109/tcyb.2017.2756840

URL : http://arxiv.org/pdf/1704.07160

J. Carreira, P. Agrawal, K. Fragkiadaki, and J. Malik, Human Pose Estimation with Iterative Error Feedback, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.4733-4742, 2016.
DOI : 10.1109/CVPR.2016.512

URL : http://arxiv.org/pdf/1507.06550

J. Carreira and A. Zisserman, Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.502

URL : http://arxiv.org/pdf/1705.07750

C. Chen and D. Ramanan, 3D Human Pose Estimation = 2D Pose Estimation + Matching, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.610

URL : http://arxiv.org/pdf/1612.06524

Y. Chen, C. Shen, X. Wei, L. Liu, and J. Yang, Adversarial PoseNet: A Structure-Aware Convolutional Network for Human Pose Estimation, 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
DOI : 10.1109/ICCV.2017.137

URL : http://arxiv.org/pdf/1705.00389

G. Chéron, I. Laptev, and C. Schmid, P-CNN: Pose-based CNN Features for Action Recognition, IEEE International Conference on Computer Vision (ICCV), 2015.

C. Chou, J. Chien, and H. Chen, Self adversarial training for human pose estimation, arXiv preprint, 2017.

X. Chu, W. Yang, W. Ouyang, C. Ma, A. L. Yuille et al., Multi-context Attention for Human Pose Estimation, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.601

URL : http://arxiv.org/pdf/1702.07432

M. Dantone, J. Gall, C. Leistner, and L. Van Gool, Human Pose Estimation Using Body Parts Dependent Joint Regressors, 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp.3041-3048, 2013.
DOI : 10.1109/CVPR.2013.391

URL : https://lirias.kuleuven.be/bitstream/123456789/398648/2/3601_open+access.pdf

G. Gkioxari, A. Toshev, and N. Jaitly, Chained Predictions Using Convolutional Neural Networks, European Conference on Computer Vision (ECCV), 2016.

S. Herath, M. Harandi, and F. Porikli, Going deeper into action recognition: A survey, Image and Vision Computing, vol.60, pp.4-21, 2017.
DOI : 10.1016/j.imavis.2017.01.010

E. Insafutdinov, L. Pishchulin, B. Andres, M. Andriluka, and B. Schiele, DeeperCut: A Deeper, Stronger, and Faster Multi-person Pose Estimation Model, European Conference on Computer Vision (ECCV), 2016.

C. Ionescu, D. Papava, V. Olaru, and C. Sminchisescu, Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.36, issue.7, pp.1325-1339, 2014.
DOI : 10.1109/TPAMI.2013.248

U. Iqbal, M. Garbade, and J. Gall, Pose for Action - Action for Pose, 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), 2017.
DOI : 10.1109/FG.2017.61

H. Jhuang, J. Gall, S. Zuffi, C. Schmid, and M. J. Black, Towards Understanding Action Recognition, 2013 IEEE International Conference on Computer Vision, 2013.
DOI : 10.1109/ICCV.2013.396

URL : https://hal.archives-ouvertes.fr/hal-00906902

I. Kokkinos, UberNet: Training a Universal Convolutional Neural Network for Low-, Mid-, and High-Level Vision Using Diverse Datasets and Limited Memory, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.579

I. Lifshitz, E. Fetaya, and S. Ullman, Human Pose Estimation Using Deep Consensus Voting, European Conference on Computer Vision (ECCV), pp.246-260, 2016.

J. Liu, A. Shahroudy, D. Xu, and G. Wang, Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition, European Conference on Computer Vision (ECCV), pp.816-833, 2016.

J. Liu, G. Wang, P. Hu, L. Duan, and A. C. Kot, Global Context-Aware Attention LSTM Networks for 3D Action Recognition, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.391

D. C. Luvizon, H. Tabia, and D. Picard, Human pose regression by combining indirect part detection and contextual information, arXiv preprint, 2017.

D. C. Luvizon, H. Tabia, and D. Picard, Learning features combination for human action recognition from skeleton sequences, Pattern Recognition Letters, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01515376

J. Martinez, R. Hossain, J. Romero, and J. J. Little, A Simple Yet Effective Baseline for 3d Human Pose Estimation, 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
DOI : 10.1109/ICCV.2017.288

D. Mehta, H. Rhodin, D. Casas, O. Sotnychenko, W. Xu et al., Monocular 3d human pose estimation using transfer learning and improved CNN supervision, 2016.

D. Mehta, S. Sridhar, O. Sotnychenko, H. Rhodin, M. Shafiei et al., VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera, ACM Transactions on Graphics, vol.36, issue.4, 2017.

A. Newell, K. Yang, and J. Deng, Stacked Hourglass Networks for Human Pose Estimation, European Conference on Computer Vision (ECCV), pp.483-499, 2016.

G. Ning, Z. Zhang, and Z. He, Knowledge-Guided Deep Fractal Neural Networks for Human Pose Estimation, IEEE Transactions on Multimedia, vol.20, issue.5, pp.1-1, 2017.
DOI : 10.1109/TMM.2017.2762010

G. Pavlakos, X. Zhou, K. G. Derpanis, and K. Daniilidis, Coarse-to-Fine Volumetric Prediction for Single-Image 3D Human Pose, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.139

T. Pfister, K. Simonyan, J. Charles, and A. Zisserman, Deep Convolutional Neural Networks for Efficient Pose Estimation in Gesture Videos, Asian Conference on Computer Vision (ACCV), 2014.
DOI : 10.1007/978-3-319-16865-4_35

L. Pishchulin, M. Andriluka, P. Gehler, and B. Schiele, Poselet Conditioned Pictorial Structures, 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp.588-595, 2013.
DOI : 10.1109/CVPR.2013.82

URL : http://www.cv-foundation.org/openaccess/content_cvpr_2013/papers/Pishchulin_Poselet_Conditioned_Pictorial_2013_CVPR_paper.pdf

L. Pishchulin, E. Insafutdinov, S. Tang, B. Andres, M. Andriluka et al., DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.533

URL : http://arxiv.org/pdf/1511.06645

A. Popa, M. Zanfir, and C. Sminchisescu, Deep Multitask Architecture for Integrated 2D and 3D Human Sensing, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.501

URL : http://arxiv.org/pdf/1701.08985

L. Lo Presti and M. La Cascia, 3D skeleton-based human action classification: A survey, Pattern Recognition, vol.53, pp.130-147, 2016.
DOI : 10.1016/j.patcog.2015.11.019

U. Rafi, I. Kostrikov, J. Gall, and B. Leibe, An Efficient Convolutional Network for Human Pose Estimation, Proceedings of the British Machine Vision Conference 2016, 2016.
DOI : 10.5244/C.30.109

URL : http://www.bmva.org/bmvc/2016/papers/paper109/abstract109.pdf

G. Rogez, P. Weinzaepfel, and C. Schmid, LCR-Net: Localization-Classification-Regression for Human Pose, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.134

URL : https://hal.archives-ouvertes.fr/hal-01505085

N. Sarafianos, B. Boteanu, B. Ionescu, and I. A. Kakadiaris, 3D Human pose estimation: A review of the literature and analysis of covariates, Computer Vision and Image Understanding, vol.152, pp.1-20, 2016.
DOI : 10.1016/j.cviu.2016.09.002

A. Shahroudy, J. Liu, T. Ng, and G. Wang, NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.115

URL : http://arxiv.org/pdf/1604.02808

A. Shahroudy, T. Ng, Y. Gong, and G. Wang, Deep multimodal feature analysis for action recognition in RGB+D videos, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
DOI : 10.1109/tpami.2017.2691321

URL : http://arxiv.org/pdf/1603.07120

S. Song, C. Lan, J. Xing, W. Zeng, and J. Liu, An end-to-end spatio-temporal attention model for human action recognition from skeleton data, AAAI Conference on Artificial Intelligence, 2017.

X. Sun, J. Shang, S. Liang, and Y. Wei, Compositional Human Pose Regression, 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
DOI : 10.1109/ICCV.2017.284

URL : http://arxiv.org/pdf/1704.00159

C. Szegedy, S. Ioffe, and V. Vanhoucke, Inception-v4, Inception-ResNet and the impact of residual connections on learning, arXiv preprint, 2016.

B. Tekin, P. Márquez-Neila, M. Salzmann, and P. Fua, Fusing 2D uncertainty and 3D cues for monocular body pose estimation, arXiv preprint, 2016.

D. Tome, C. Russell, and L. Agapito, Lifting from the deep: Convolutional 3D pose estimation from a single image, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

J. Tompson, R. Goroshin, A. Jain, Y. LeCun, and C. Bregler, Efficient object localization using Convolutional Networks, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.648-656, 2015.
DOI : 10.1109/CVPR.2015.7298664

URL : http://arxiv.org/pdf/1411.4280

A. Toshev and C. Szegedy, DeepPose: Human Pose Estimation via Deep Neural Networks, 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp.1653-1660, 2014.
DOI : 10.1109/CVPR.2014.214

URL : http://arxiv.org/pdf/1312.4659

G. Varol, I. Laptev, and C. Schmid, Long-term Temporal Convolutions for Action Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
DOI : 10.1109/tpami.2017.2712608

URL : https://hal.archives-ouvertes.fr/hal-01241518

S. Wei, V. Ramakrishna, T. Kanade, and Y. Sheikh, Convolutional Pose Machines, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.511

URL : http://arxiv.org/pdf/1602.00134

B. Xiaohan Nie, C. Xiong, and S. Zhu, Joint action recognition and pose estimation from video, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.

W. Yang, S. Li, W. Ouyang, H. Li, and X. Wang, Learning Feature Pyramids for Human Pose Estimation, 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
DOI : 10.1109/ICCV.2017.144

URL : http://arxiv.org/pdf/1708.01101

A. Yao, J. Gall, and L. Van Gool, Coupled Action Recognition and Pose Estimation from Multiple Views, International Journal of Computer Vision, vol.100, issue.1, pp.16-37, 2012.
URL : http://www.vision.ee.ethz.ch/%7Eyaoa/pdfs/yao_ijcv2012.pdf

K. M. Yi, E. Trulls, V. Lepetit, and P. Fua, LIFT: Learned Invariant Feature Transform, European Conference on Computer Vision (ECCV), 2016.

W. Zhang, M. Zhu, and K. G. Derpanis, From Actemes to Action: A Strongly-Supervised Representation for Detailed Action Understanding, 2013 IEEE International Conference on Computer Vision, pp.2248-2255, 2013.
DOI : 10.1109/ICCV.2013.280

X. Zhou, M. Zhu, G. Pavlakos, S. Leonardos, K. G. Derpanis et al., MonoCap: Monocular Human Motion Capture using a CNN Coupled with a Geometric Prior, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
DOI : 10.1109/TPAMI.2018.2816031