Conference paper, Year: 2022

Forecasting of Depth and Ego-Motion with Transformers and Self-supervision

Abstract

This paper addresses the problem of end-to-end self-supervised forecasting of depth and ego-motion. Given a sequence of raw images, the aim is to forecast both the scene geometry and the camera ego-motion using a self-supervised photometric loss. The architecture combines convolutional and transformer modules, leveraging the benefits of both: the inductive bias of CNNs and the multi-head attention of transformers. This yields a rich spatio-temporal representation that supports accurate depth forecasting. Prior work attempts to solve this problem with multi-modal inputs and outputs supervised by ground-truth data, which is impractical because it requires a large annotated dataset. In contrast to prior methods, this paper forecasts depth and ego-motion from raw images alone, using only self-supervision. The approach performs strongly on the KITTI benchmark, with several performance criteria even comparable to prior non-forecasting self-supervised monocular depth estimation methods.
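The abstract describes the design only at a high level. The sketch below gives a minimal PyTorch illustration of such a pipeline: a CNN encodes each frame, a transformer attends across the frame sequence, and separate heads forecast a future depth map and ego-motion; training would minimize a photometric loss between the target frame and a view synthesized from the predictions. The module layout, dimensions, and names here (DepthEgoMotionForecaster, photometric_loss) are illustrative assumptions, not the authors' implementation, and the SSIM + L1 photometric term follows common self-supervised monocular depth practice rather than the paper's exact formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthEgoMotionForecaster(nn.Module):
    """Hypothetical hybrid model: a CNN encodes each frame, a transformer
    attends over the frame sequence, and two heads forecast the next
    frame's depth map and the ego-motion to it."""
    def __init__(self, d_model=256):
        super().__init__()
        # CNN stem: supplies the locality / translation-equivariance
        # inductive bias the abstract attributes to convolutions.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(64, d_model, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Transformer: multi-head attention over per-frame tokens builds
        # the spatio-temporal representation.
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=4)
        self.depth_head = nn.Conv2d(d_model, 1, 3, padding=1)  # future depth
        self.pose_head = nn.Linear(d_model, 6)  # future ego-motion (3 rot + 3 trans)

    def forward(self, frames):  # frames: (B, T, 3, H, W)
        B, T = frames.shape[:2]
        feats = self.encoder(frames.flatten(0, 1))        # (B*T, D, h, w)
        D, h, w = feats.shape[1:]
        tokens = feats.flatten(2).mean(-1).view(B, T, D)  # one token per frame
        ctx = self.temporal(tokens)                       # (B, T, D)
        # Condition the last frame's feature map on the temporal context.
        spatial = feats.view(B, T, D, h, w)[:, -1] + ctx[:, -1, :, None, None]
        depth = F.softplus(self.depth_head(spatial))      # positive depths
        pose = self.pose_head(ctx[:, -1])
        return depth, pose

def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    # Patch-wise structural similarity computed with 3x3 average pooling.
    mu_x, mu_y = F.avg_pool2d(x, 3, 1, 1), F.avg_pool2d(y, 3, 1, 1)
    var_x = F.avg_pool2d(x * x, 3, 1, 1) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, 3, 1, 1) - mu_y ** 2
    cov = F.avg_pool2d(x * y, 3, 1, 1) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return ((1 - num / den) / 2).clamp(0, 1)

def photometric_loss(synthesized, target, alpha=0.85):
    # Common SSIM + L1 objective from self-supervised monocular depth
    # work; `synthesized` is the target frame re-rendered by warping a
    # source frame with the predicted depth and pose (warping omitted).
    return (alpha * ssim(synthesized, target)
            + (1 - alpha) * (synthesized - target).abs()).mean()

For instance, DepthEgoMotionForecaster()(torch.randn(2, 4, 3, 128, 416)) returns a (2, 1, 32, 104) depth map and a (2, 6) pose vector under the assumed stride-4 encoder.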
No file deposited

Dates and versions

hal-03655106, version 1 (29-04-2022)

Identifiers

  • HAL Id: hal-03655106, version 1

Cite

Houssem Eddine Boulahbal, Adrian Voicila, Andrew I. Comport. Forecasting of Depth and Ego-Motion with Transformers and Self-supervision. 26th International Conference on Pattern Recognition (ICPR 2022), Aug 2022, Montreal, Canada. ⟨hal-03655106⟩