P-CNN: Pose-based CNN features for action recognition, Proceedings of the IEEE international conference on computer vision, pp.3218-3226, 2015.
YOLO9000: Better, Faster, Stronger, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.6517-6525, 2017.
DOI : 10.1109/CVPR.2017.690
URL : http://arxiv.org/pdf/1612.08242
Learning Deep Features for Discriminative Localization, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.319
URL : http://arxiv.org/pdf/1512.04150
Recipe recognition with large multimodal food dataset, Multimedia & Expo Workshops (ICMEW), 2015 IEEE International Conference on, pp.1-6, 2015.
DOI : 10.1109/ICMEW.2015.7169816
URL : https://hal.archives-ouvertes.fr/hal-01196959
Food recognition and recipe analysis: integrating visual content, context and external knowledge, 2018.
Deep-based Ingredient Recognition for Cooking Recipe Retrieval, Proceedings of the 2016 ACM on Multimedia Conference, MM '16, pp.32-41, 2016.
Predicting the Structure of Cooking Recipes, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp.781-786, 2015.
DOI : 10.18653/v1/D15-1090
URL : https://doi.org/10.18653/v1/d15-1090
A method for extracting major workflow composed of ingredients, tools, and actions from cooking procedural text, 2016 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), pp.1-6, 2016.
DOI : 10.1109/ICMEW.2016.7574705
Unsupervised visual-linguistic reference resolution in instructional videos, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
Recognizing Fine-Grained and Composite Activities Using Hand-Centric Features and Script Data, International Journal of Computer Vision, vol.34, issue.9, pp.1-28.
DOI : 10.1109/ICCVW.2011.6130353
URL : http://arxiv.org/pdf/1502.06648
Combining embedded accelerometers with computer vision for recognizing food preparation activities, Proceedings of the 2013 ACM international joint conference on Pervasive and ubiquitous computing, UbiComp '13, 2013.
DOI : 10.1145/2493432.2493482
URL : http://cvip.computing.dundee.ac.uk/papers/Stein2013UbiComp.pdf
The Language of Actions: Recovering the Syntax and Semantics of Goal-Directed Human Activities, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.105
KUSK dataset, Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing Adjunct Publication, UbiComp '14 Adjunct, pp.583-588, 2014.
DOI : 10.1145/2638728.2641338
Towards automatic learning of procedures from web instructional videos, arXiv preprint, 2017.
Scaling egocentric vision: The EPIC-KITCHENS dataset, arXiv preprint, 2018.
ImageNet classification with deep convolutional neural networks, Advances in neural information processing systems, pp.1097-1105, 2012.
DOI : 10.1162/neco.2009.10-08-881
URL : http://dl.acm.org/ft_gateway.cfm?id=3065386&type=pdf
Is object localization for free? - Weakly-supervised learning with convolutional neural networks, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.685-694, 2015.
DOI : 10.1109/CVPR.2015.7298668
URL : https://hal.archives-ouvertes.fr/hal-01015140
Grad-CAM: Visual explanations from deep networks via gradient-based localization, International Conference on Computer Vision, 2017.