B. Ben Amor, J. Su, and A. Srivastava, Action Recognition Using Rate-Invariant Analysis of Skeletal Shape Trajectories, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.38, issue.1, pp.1-13, 2016.
DOI : 10.1109/TPAMI.2015.2439257

P. Bilinski, M. Koperski, S. Bak, and F. Bremond, Representing visual appearance by video Brownian covariance descriptor for human action recognition, 2014 11th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 2014.
DOI : 10.1109/AVSS.2014.6918649

URL : https://hal.archives-ouvertes.fr/hal-01054943

T. Brox, A. Bruhn, N. Papenberg, and J. Weickert, High Accuracy Optical Flow Estimation Based on a Theory for Warping, ECCV, 2004.
DOI : 10.1007/978-3-540-24673-2_3

URL : http://www.mia.uni-saarland.de/brox/OpticFlowWarping.pdf

Z. Cao, T. Simon, S. Wei, and Y. Sheikh, Realtime multi-person 2D pose estimation using part affinity fields, 2016.
DOI : 10.1109/cvpr.2017.143

G. Chéron, I. Laptev, and C. Schmid, P-CNN: Pose-based CNN features for action recognition, ICCV, 2015.

G. Gkioxari, B. Hariharan, R. Girshick, and J. Malik, Using k-Poselets for Detecting People and Localizing Their Keypoints, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.458

URL : http://www.cs.berkeley.edu/~bharath2/pubs/pdfs/GeorgiaBharathCVPR2014b.pdf

G. Gkioxari and J. Malik, Finding action tubes, CVPR, 2015.

J. Hu, W. Zheng, J. Lai, and J. Zhang, Jointly learning heterogeneous features for RGB-D activity recognition, CVPR, 2015.
DOI : 10.1109/cvpr.2015.7299172

A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar et al., Large-Scale Video Classification with Convolutional Neural Networks, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.223

URL : http://www.cs.cmu.edu/~rahuls/pub/cvpr2014-deepvideo-rahuls.pdf

Y. Kong and Y. Fu, Bilinear heterogeneous information machine for RGB-D action recognition, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7298708

M. Koperski, P. Bilinski, and F. Bremond, 3D trajectories for action recognition, 2014 IEEE International Conference on Image Processing (ICIP), 2014.
DOI : 10.1109/ICIP.2014.7025848

URL : https://hal.archives-ouvertes.fr/hal-01054949

M. Koperski and F. Bremond, Modeling spatial layout of features for real world scenario RGB-D action recognition, 2016 13th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 2016.
DOI : 10.1109/AVSS.2016.7738023

URL : https://hal.archives-ouvertes.fr/hal-01399037

H. S. Koppula, R. Gupta, and A. Saxena, Learning human activities and object affordances from RGB-D videos, The International Journal of Robotics Research, vol.32, issue.8, pp.951-970, 2013.
DOI : 10.1177/0278364913478446

URL : http://journals.sagepub.com/doi/pdf/10.1177/0278364913478446

A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, Communications of the ACM, vol.60, issue.6, 2017.
DOI : 10.1145/3065386

URL : http://dl.acm.org/ft_gateway.cfm?id=3065386&type=pdf

I. Laptev and T. Lindeberg, Space-time interest points, ICCV, 2003.
DOI : 10.1109/iccv.2003.1238378

I. Laptev, M. Marszałek, C. Schmid, and B. Rozenfeld, Learning realistic human actions from movies, 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
DOI : 10.1109/CVPR.2008.4587756

URL : https://hal.archives-ouvertes.fr/inria-00548659

L. Liu and L. Shao, Learning discriminative representations from RGB-D video data, IJCAI, 2013.

B. Mahasseni and S. Todorovic, Regularizing LSTM with 3D human-skeleton sequences for action recognition, CVPR, 2016.

A decision forest based feature selection framework for action recognition from RGB-depth cameras, ICIAR, 2013.

O. Oreifej and Z. Liu, HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.98

L. Pishchulin, E. Insafutdinov, S. Tang, B. Andres, M. Andriluka et al., DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.533

URL : http://arxiv.org/pdf/1511.06645

L. Pishchulin, A. Jain, M. Andriluka, T. Thormählen, and B. Schiele, Articulated people detection and pose estimation: Reshaping the future, CVPR, 2012.
DOI : 10.1109/cvpr.2012.6248052

URL : http://www.informatik.uni-marburg.de/~thormae/paper/CVPR12.pdf

C. Schuldt, I. Laptev, and B. Caputo, Recognizing human actions: a local SVM approach, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004., 2004.
DOI : 10.1109/ICPR.2004.1334462

URL : http://www.nada.kth.se/%7Ecaputo/publik/icpr04actions.pdf

L. Seidenari, V. Varano, S. Berretti, A. D. Bimbo, and P. Pala, Recognizing Actions from Depth Cameras as Weakly Aligned Multi-part Bag-of-Poses, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2013.
DOI : 10.1109/CVPRW.2013.77

URL : http://www.micc.unifi.it/publications/2013/SVBPD13/PID2770689.pdf

A. Shahroudy, G. Wang, and T. Ng, Multi-modal feature fusion for action recognition in RGB-D sequences, 2014 6th International Symposium on Communications, Control and Signal Processing (ISCCSP), 2014.
DOI : 10.1109/ISCCSP.2014.6877819

J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio et al., Real-time human pose recognition in parts from single depth images, CVPR, 2011.
DOI : 10.1007/978-3-642-28661-2_5

K. Simonyan and A. Zisserman, Two-stream convolutional networks for action recognition in videos, NIPS, 2014.

M. Sun and S. Savarese, Articulated part-based model for joint object detection and pose estimation, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126309

J. Sung, C. Ponce, B. Selman, and A. Saxena, Unstructured human activity detection from RGBD images, 2012.

R. Vemulapalli and R. Chellappa, Rolling rotations for recognizing human actions from 3D skeletal data, CVPR, 2016.
DOI : 10.1109/cvpr.2016.484

H. Wang and C. Schmid, Action Recognition with Improved Trajectories, 2013 IEEE International Conference on Computer Vision, 2013.
DOI : 10.1109/ICCV.2013.441

URL : https://hal.archives-ouvertes.fr/hal-00873267

L. Wang, Y. Qiao, and X. Tang, Action recognition with trajectory-pooled deep-convolutional descriptors, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7299059

URL : http://wanglimin.github.io/papers/WangQT_CVPR15.pdf

D. Wu and L. Shao, Leveraging hierarchical parametric networks for skeletal joints based action segmentation and recognition, CVPR, 2014.
DOI : 10.1109/cvpr.2014.98

URL : http://lshao.staff.shef.ac.uk/pub/DBN_HMM_CVPR2014.pdf

Y. Wu, Mining actionlet ensemble for action recognition with depth cameras, CVPR, 2012.

L. Xia and J. Aggarwal, Spatio-temporal Depth Cuboid Similarity Feature for Activity Recognition Using Depth Camera, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.365

URL : http://cvrc.ece.utexas.edu/lu/CVPR2013_Lu_20130918.pdf

X. Yang and Y. Tian, Super Normal Vector for Activity Recognition Using Depth Sequences, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.108

URL : http://yangxd.org/publications/papers/SNV.pdf

J. Yue-Hei Ng, M. Hausknecht, S. Vijayanarasimhan, O. Vinyals, R. Monga et al., Beyond short snippets: Deep networks for video classification, CVPR, 2015.

Y. Zhu, W. Chen, and G. Guo, Evaluating spatiotemporal interest point features for depth-based action recognition, Image and Vision Computing, vol.32, issue.8, pp.453-464, 2014.
DOI : 10.1016/j.imavis.2014.04.005