D. A. Clevert, T. Unterthiner, and S. Hochreiter, Fast and accurate deep network learning by exponential linear units (elus), 2015.

A. Dosovitskiy, P. Fischer, E. Ilg, P. Häusser, C. Hazırbaş et al., Flownet: Learning optical flow with convolutional networks, IEEE International Conference on Computer Vision (ICCV), 2015.

D. Eigen, C. Puhrsch, and R. Fergus, Depth map prediction from a single image using a multi-scale deep network, Advances in neural information processing systems, pp.2366-2374, 2014.

R. Garg, V. K. Bg, G. Carneiro, and I. Reid, Unsupervised cnn for single view depth estimation: Geometry to the rescue, European Conference on Computer Vision, pp.740-756, 2016.

A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, Vision meets robotics: The kitti dataset, The International Journal of Robotics Research, vol.32, issue.11, pp.1231-1237, 2013.

C. Godard, O. Mac Aodha, and G. J. Brostow, Unsupervised monocular depth estimation with left-right consistency, 2017.

M. Jaderberg, K. Simonyan, and A. Zisserman, Spatial transformer networks, Advances in neural information processing systems, pp.2017-2025, 2015.

A. Kendall, H. Martirosyan, S. Dasgupta, P. Henry, R. Kennedy et al., End-to-end learning of geometry and context for deep stereo regression, 2017.

D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, 2014.

R. Mahjourian, M. Wicke, and A. Angelova, Unsupervised learning of depth and ego-motion from monocular video using 3d geometric constraints, 2018.

V. Nair and G. E. Hinton, Rectified linear units improve restricted boltzmann machines, Proceedings of the 27th international conference on machine learning (ICML-10), pp.807-814, 2010.

N. Mayer, E. Ilg, P. Häusser, P. Fischer, D. Cremers et al., A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang et al., Automatic differentiation in pytorch, 2017.

C. Pinard, L. Chevalley, A. Manzanera, and D. Filliat, End-to-end depth from motion with stabilized monocular videos, ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences IV-2/W3, pp.67-74, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01587652

C. Pinard, L. Chevalley, A. Manzanera, and D. Filliat, Multi range real-time depth inference from a monocular stabilized footage using a fully convolutional neural network, European Conference on Mobile Robotics, ENSTA ParisTech, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01587658

A. Ranjan, V. Jampani, K. Kim, D. Sun, J. Wulff et al., Adversarial collaboration: Joint unsupervised learning of depth, camera motion, optical flow and motion segmentation, 2018.

A. Saxena, M. Sun, and A. Y. Ng, Make3d: Learning 3d scene structure from a single still image, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.31, issue.5, pp.824-840, 2009.

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2014.

B. Ummenhofer, H. Zhou, J. Uhrig, N. Mayer, E. Ilg et al., Demon: Depth and motion network for learning monocular stereo, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

S. Vijayanarasimhan, S. Ricco, C. Schmid, R. Sukthankar, and K. Fragkiadaki, Sfmnet: Learning of structure and motion from video, 2017.

Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE transactions on image processing, vol.13, issue.4, pp.600-612, 2004.

J. Xie, R. Girshick, and A. Farhadi, Deep3d: Fully automatic 2d-to-3d video conversion with deep convolutional neural networks, European Conference on Computer Vision, pp.842-857, 2016.

Z. Yin and J. Shi, Geonet: Unsupervised learning of dense depth, optical flow and camera pose, 2018.

J. Zbontar and Y. LeCun, Stereo matching by training a convolutional neural network to compare image patches, Journal of Machine Learning Research, vol.17, pp.1-32, 2016.

T. Zhou, M. Brown, N. Snavely, and D. G. Lowe, Unsupervised learning of depth and ego-motion from video, 2017.