A. Dosovitskiy, G. Ros, F. Codevilla, A. López, and V. Koltun, CARLA: an open urban driving simulator, 1st Annual Conference on Robot Learning, pp.1-16, 2017.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems, pp.1106-1114, 2012.

E. Shelhamer, J. Long, and T. Darrell, Fully convolutional networks for semantic segmentation, IEEE Trans. Pattern Anal. Mach. Intell, vol.39, issue.4, pp.640-651, 2017.

K. He, X. Zhang, S. Ren, and J. Sun, Spatial pyramid pooling in deep convolutional networks for visual recognition, Computer Vision-ECCV 2014-13th European Conference, pp.346-361, 2014.

F. Yu and V. Koltun, Multi-scale context aggregation by dilated convolutions, ICLR, 2016.

A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, Vision meets robotics: The KITTI dataset, I. J. Robotics Res, vol.32, issue.11, pp.1231-1237, 2013.

G. J. Brostow, J. Fauqueur, and R. Cipolla, Semantic object classes in video: A high-definition ground truth database, Pattern Recognition Letters, vol.30, issue.2, pp.88-97, 2009.

M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler et al., The Cityscapes dataset for semantic urban scene understanding, 2016 IEEE Conference on Computer Vision and Pattern Recognition, pp.3213-3223, 2016.

X. Huang, X. Cheng, Q. Geng, B. Cao, D. Zhou et al., The ApolloScape dataset for autonomous driving, 2018.

P. Krähenbühl and V. Koltun, Efficient inference in fully connected CRFs with Gaussian edge potentials, Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems, pp.109-117, 2011.

S. Zheng, S. Jayasumana, B. Romera-Paredes, V. Vineet, Z. Su et al., Conditional random fields as recurrent neural networks, 2015 IEEE International Conference on Computer Vision, ICCV 2015, pp.1529-1537, 2015.

A. Kendall, Y. Gal, and R. Cipolla, Multi-task learning using uncertainty to weigh losses for scene geometry and semantics, 2017.

C. Hazirbas, L. Ma, C. Domokos, and D. Cremers, FuseNet: Incorporating depth into semantic segmentation via fusion-based CNN architecture, Computer Vision-ACCV 2016-13th Asian Conference on Computer Vision, Revised Selected Papers, Part I, pp.213-228, 2016.

P. Kohli, L. Ladicky, and P. H. Torr, Robust higher order potentials for enforcing label consistency, International Journal of Computer Vision, vol.82, issue.3, pp.302-324, 2009.

J. Wang and J. Kim, Semantic segmentation of urban scenes with a location prior map using lidar measurements, IEEE, pp.661-666, 2017.

S. Shah, D. Dey, C. Lovett, and A. Kapoor, AirSim: High-fidelity visual and physical simulation for autonomous vehicles, Field and Service Robotics, 2017.

A. Best, S. Narang, D. Barber, and D. Manocha, AutonoVi: Autonomous vehicle planning with dynamic maneuvers and traffic constraints, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp.2629-2636, 2017.

L. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs, IEEE Trans. Pattern Anal. Mach. Intell, vol.40, issue.4, pp.834-848, 2018.

I. Laina, C. Rupprecht, V. Belagiannis, F. Tombari, and N. Navab, Deeper depth prediction with fully convolutional residual networks, Fourth International Conference on 3D Vision, vol.3, pp.239-248, 2016.

D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, 2014.