R. Arandjelović, P. Gronat, A. Torii, T. Pajdla, and J. Sivic, NetVLAD: CNN architecture for weakly supervised place recognition, CVPR, 2016.

A. Bagdanov and M. Worring, Fine-grained document genre classification using first order random graphs, Proceedings of Sixth International Conference on Document Analysis and Recognition, pp.79-83, 2001.
DOI : 10.1109/ICDAR.2001.953759

K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman, Return of the Devil in the Details: Delving Deep into Convolutional Nets, Proceedings of the British Machine Vision Conference 2014, 2014.
DOI : 10.5244/C.28.6

S. Chen, Y. He, J. Sun, and S. Naoi, Structured document classification by matching local salient features, IEEE, pp.653-656, 2012.

M. Cimpoi, S. Maji, and A. Vedaldi, Deep filter banks for texture recognition and segmentation, Proceedings of the IEEE CVPR, pp.3828-3836, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01263622

G. Csurka, C. R. Dance, L. Fan, J. Willamowski, and C. Bray, Visual categorization with bags of keypoints, Int. Work. on Stat. Learning in Comp. Vision, 2004.

J. Deng, W. Dong, R. Socher, L. J. Li, K. Li et al., Imagenet: A large-scale hierarchical image database, IEEE, 2009.

V. Eglin and S. Bres, Document page similarity based on layout visual saliency: application to query by example and document classification, Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR), 2003.
DOI : 10.1109/ICDAR.2003.1227849

M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results.

P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan, Object detection with discriminatively trained part-based models, Trans. PAMI, vol.32, issue.9, 2010.

Y. Gong, L. Wang, R. Guo, and S. Lazebnik, Multi-scale Orderless Pooling of Deep Convolutional Activation Features, ECCV, 2014.
DOI : 10.1007/978-3-319-10584-0_26

L. P. de las Heras, O. R. Terrades, J. Llados, D. Fernandez-Mota, and C. Canero, Use case visual Bag-of-Words techniques for camera based identity document classification, 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pp.721-725, 2015.
DOI : 10.1109/ICDAR.2015.7333856

L. Herranz, S. Jiang, and X. Li, Scene Recognition with CNNs: Objects, Scales and Dataset Bias, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.571-579, 2016.
DOI : 10.1109/CVPR.2016.68

H. Jégou, F. Perronnin, M. Douze, and C. Schmid, Aggregating Local Image Descriptors into Compact Codes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.9, 2012.
DOI : 10.1109/TPAMI.2011.235

A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, NIPS, pp.1097-1105, 2012.

J. Kumar and D. Doermann, Unsupervised Classification of Structurally Similar Document Images, 2013 12th International Conference on Document Analysis and Recognition, pp.1225-1229, 2013.
DOI : 10.1109/ICDAR.2013.248

S. Lai, L. Xu, K. Liu, and J. Zhao, Recurrent convolutional neural networks for text classification, AAAI, pp.2267-2273, 2015.

S. Lazebnik, C. Schmid, and J. Ponce, Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), 2006.
DOI : 10.1109/CVPR.2006.68
URL : https://hal.archives-ouvertes.fr/inria-00548585

L. Liu, C. Shen, L. Wang, A. van den Hengel, and C. Wang, Encoding high dimensional local features by sparse coding based Fisher vectors, NIPS, 2014.

J. Long, E. Shelhamer, and T. Darrell, Fully convolutional networks for semantic segmentation, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.3431-3440, 2015.
DOI : 10.1109/CVPR.2015.7298965

D. Lowe, Object recognition from local scale-invariant features, Proceedings of the Seventh IEEE International Conference on Computer Vision, 1999.
DOI : 10.1109/ICCV.1999.790410

M. Oquab, L. Bottou, I. Laptev, and J. Sivic, Learning and transferring mid-level image representations using convolutional neural networks, CVPR, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00911179

F. Perronnin, J. Sánchez, and T. Mensink, Improving the Fisher Kernel for Large-Scale Image Classification, ECCV, 2010.
DOI : 10.1007/978-3-642-15561-1_11
URL : https://hal.archives-ouvertes.fr/inria-00548630

C. Shin and D. Doermann, Document image retrieval based on layout structural similarity, pp.606-612, 2006.

R. Sicre, Y. Avrithis, E. Kijak, and F. Jurie, Unsupervised Part Learning for Visual Recognition, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.332
URL : https://hal.archives-ouvertes.fr/hal-01507379

R. Sicre and T. Gevers, Dense sampling of features for image retrieval, 2014 IEEE International Conference on Image Processing (ICIP), pp.3057-3061, 2014.
DOI : 10.1109/ICIP.2014.7025618

R. Sicre and H. Jégou, Memory Vectors for Particular Object Retrieval with Multiple Queries, Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, ICMR '15, pp.479-482, 2015.

R. Sicre and F. Jurie, Discriminative part model for visual recognition, Computer Vision and Image Understanding, vol.141, pp.28-37, 2015.
DOI : 10.1016/j.cviu.2015.08.002
URL : https://hal.archives-ouvertes.fr/hal-01132389

R. Sicre, H. E. Tasli, and T. Gevers, Superpixel based angular differences as a midlevel image descriptor, IEEE, pp.3732-3737, 2014.

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, ICLR, 2015.

C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed et al., Going deeper with convolutions, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7298594

H. E. Tasli, R. Sicre, T. Gevers, and A. A. Alatan, Geometry-constrained spatial pyramid adaptation for image classification, 2014 IEEE International Conference on Image Processing (ICIP), 2014.
DOI : 10.1109/ICIP.2014.7025209

G. Tolias, R. Sicre, and H. Jégou, Particular object retrieval with integral max-pooling of CNN activations, ICLR, 2016.

C. Xing, D. Wang, X. Zhang, and C. Liu, Document classification with distributions of word vectors, 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014.
DOI : 10.1109/APSIPA.2014.7041633

B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva, Learning Deep Features for Scene Recognition using Places Database, NIPS, 2014.