S. Agarwal, Y. Furukawa, N. Snavely, B. Curless, S. M. Seitz et al., Reconstructing Rome, IEEE Computer, pp.40-47, 2010.

S. Agarwal, N. Snavely, I. Simon, S. M. Seitz, and R. Szeliski, Building Rome in a day, IEEE International Conference on Computer Vision, pp.72-79, 2009.

S. Branson, J. D. Wegner, D. Hall, N. Lang, K. Schindler et al., From Google Maps to a fine-grained catalog of street trees, ISPRS Journal of Photogrammetry and Remote Sensing, vol.135, pp.13-30, 2018.

J. Bromley, I. Guyon, Y. LeCun, E. Säckinger, and R. Shah, Signature verification using a "siamese" time delay neural network, Advances in Neural Information Processing Systems, pp.737-744, 1994.

X. Chen, H. Ma, J. Wan, B. Li, and T. Xia, Multi-view 3D object detection network for autonomous driving, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.1907-1915, 2017.

M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler et al., The Cityscapes dataset for semantic urban scene understanding, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3213-3223, 2016.

S. En, A. Lechervy, and F. Jurie, RPNet: An end-to-end network for relative camera pose estimation, European Conference on Computer Vision, pp.738-745, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01879117

A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, Vision meets robotics: The KITTI dataset, The International Journal of Robotics Research, vol.32, issue.11, pp.1231-1237, 2013.

X. Han, T. Leung, Y. Jia, R. Sukthankar, and A. C. Berg, MatchNet: Unifying feature and metric learning for patch-based matching, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3279-3286, 2015.

K. He, G. Gkioxari, P. Dollár, and R. Girshick, Mask R-CNN, IEEE International Conference on Computer Vision, pp.2980-2988, 2017.

K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.770-778, 2016.

J. Huang, V. Rathod, C. Sun, M. Zhu, A. Korattikara et al., Speed/accuracy trade-offs for modern convolutional object detectors, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.7310-7311, 2017.

A. Kendall, M. Grimes, and R. Cipolla, PoseNet: A convolutional network for real-time 6-DOF camera relocalization, Proceedings of the IEEE International Conference on Computer Vision, pp.2938-2946, 2015.

V. Krylov, E. Kenny, and R. Dahyot, Automatic discovery and geotagging of objects from street view imagery, Remote Sensing, vol.10, issue.5, p.661, 2018.

V. A. Krylov and R. Dahyot, Object geolocation using MRF based multi-sensor fusion, 25th IEEE International Conference on Image Processing (ICIP), pp.2745-2749, 2018.

J. Ku, M. Mozifian, J. Lee, A. Harakeh, and S. L. Waslander, Joint 3D proposal generation and object detection from view aggregation, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.1-8, 2018.

S. Lefèvre, D. Tuia, J. D. Wegner, T. Produit, and A. S. Nassar, Toward seamless multiview scene analysis from satellite to street level, Proceedings of the IEEE, vol.105, issue.10, pp.1884-1899, 2017.

W. Li, R. Zhao, T. Xiao, and X. Wang, DeepReID: Deep filter pairing neural network for person re-identification, IEEE Conference on Computer Vision and Pattern Recognition, pp.152-159, 2014.

W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed et al., SSD: Single shot multibox detector, European Conference on Computer Vision, pp.21-37, 2016.

D. C. Luvizon, D. Picard, and H. Tabia, 2D/3D pose estimation and action recognition using multitask deep learning, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.5137-5146, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01815703

Y. Nakajima and H. Saito, Robust camera pose estimation by viewpoint classification using deep learning, Computational Visual Media, vol.3, issue.2, pp.189-198, 2017.

G. Neuhold, T. Ollmann, S. R. Bulò, and P. Kontschieder, The Mapillary Vistas dataset for semantic understanding of street scenes, IEEE International Conference on Computer Vision, pp.5000-5009, 2017.

S. Nilwong, D. Hossain, S. Kaneko, and G. Capi, Outdoor landmark detection for real-world localization using Faster R-CNN, 6th International Conference on Control, Mechatronics and Automation, pp.165-169, 2018.

G. Poier, D. Schinagl, and H. Bischof, Learning pose specific representations by predicting different views, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.60-69, 2018.

S. Ren, K. He, R. Girshick, and J. Sun, Faster R-CNN: Towards real-time object detection with region proposal networks, Advances in Neural Information Processing Systems, pp.91-99, 2015.

F. Schroff, D. Kalenichenko, and J. Philbin, FaceNet: A unified embedding for face recognition and clustering, IEEE Conference on Computer Vision and Pattern Recognition, pp.815-823, 2015.

E. Shechtman and M. Irani, Matching local self-similarities across images and videos, IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2007.

S. Sun, R. Sarukkai, J. Kwok, and V. Shet, Accurate deep direct geo-localization from ground imagery and phone-grade GPS, IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp.1016-1023, 2018.

R. Tao, E. Gavves, and A. W. Smeulders, Siamese instance search for tracking, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.1420-1429, 2016.

J. D. Wegner, S. Branson, D. Hall, K. Schindler, and P. Perona, Cataloging public objects using aerial and street-level images - urban trees, IEEE Conference on Computer Vision and Pattern Recognition, pp.6014-6023, 2016.

Y. Xiang, T. Schmidt, V. Narayanan, and D. Fox, PoseCNN: A convolutional neural network for 6D object pose estimation in cluttered scenes, Robotics: Science and Systems, 2018.

J. Xiao, Y. Xie, T. Tillo, K. Huang, Y. Wei et al., IAN: The individual aggregation network for person search, Pattern Recognition, vol.87, pp.332-340, 2019.

T. Xiao, S. Li, B. Wang, L. Lin, and X. Wang, Joint detection and identification feature learning for person search, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3415-3424, 2017.

C. Yang, J. Huang, and M. Yang, Exploiting self-similarities for single frame super-resolution, Asian Conference on Computer Vision, pp.497-510, 2010.

J. Zbontar and Y. LeCun, Stereo matching by training a convolutional neural network to compare image patches, Journal of Machine Learning Research, vol.17, issue.2, pp.1-32, 2016.

W. Zhang, C. Witharana, W. Li, C. Zhang, X. Li et al., Using deep learning to identify utility poles with crossarms and estimate their locations from Google Street View images, Sensors, vol.18, issue.8, p.2484, 2018.

J. Zhao, X. N. Zhang, H. Gao, J. Yin, M. Zhou et al., Object detection based on hierarchical multi-view proposal network for autonomous driving, 2018 International Joint Conference on Neural Networks (IJCNN), pp.1-6, 2018.

Q. Zheng, W. Wang, and W. Gao, Effective and efficient object-based image retrieval using visual phrases, Proceedings of the 14th ACM International Conference on Multimedia, pp.77-80, 2006.

X. Zhou, K. Yu, T. Zhang, and T. S. Huang, Image classification using super-vector coding of local image descriptors, European Conference on Computer Vision, pp.141-154, 2010.