Temporal Difference Rewards for End-to-end Vision-based Active Robot Tracking using Deep Reinforcement Learning - Archive ouverte HAL
Conference paper, Year: 2021

Temporal Difference Rewards for End-to-end Vision-based Active Robot Tracking using Deep Reinforcement Learning

Abstract

Object tracking localizes moving objects in sequences of frames, providing detailed information about the trajectories of objects that appear in a scene. In this paper, we study active object tracking, where a tracker receives an input visual observation and directly outputs the most appropriate control actions to follow the target and keep it in its field of view, thereby unifying visual tracking and control. This is in contrast to conventional tracking approaches, as typically developed by the computer vision community, where the problem of detecting the tracked object in a frame is decoupled from the problem of controlling the camera and/or the robot to follow the object. Deep Reinforcement Learning (DRL) methods hold the credentials for overcoming these issues, since they allow for tackling both problems, i.e., detecting the tracked object and providing control commands, at the same time. However, DRL algorithms require a significantly different training methodology than traditional computer vision models, e.g., they rely on dynamic simulations for training instead of static datasets, and they are notoriously difficult to train to convergence, often requiring reward shaping approaches to increase convergence speed and stability. The main contribution of this paper is a DRL, vision-based active tracking method, along with an appropriately designed reward shaping approach for active tracking problems. The developed methods are evaluated using a state-of-the-art robotics simulator, demonstrating good generalization over various dynamic trajectories of moving objects under a wide range of different setups.
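The abstract does not spell out the shaping scheme, but the title's "temporal difference rewards" is commonly realized as potential-based reward shaping, where the agent receives a bonus equal to the discounted change of a potential function between consecutive states. The sketch below is purely illustrative, not the paper's implementation: the potential `phi`, the error-based state representation, and the function names are all assumptions introduced for this example.

```python
# Illustrative sketch of temporal-difference-style (potential-based) reward
# shaping for an active tracking agent. NOT the paper's exact method: the
# potential function and the (angle_error, distance_error) state encoding
# are hypothetical choices made for this example.

GAMMA = 0.99  # discount factor of the underlying RL problem


def phi(angle_error, distance_error):
    """Hypothetical tracking potential: highest (zero) when the target is
    perfectly centered in the field of view and at the desired distance."""
    return -(abs(angle_error) + abs(distance_error))


def shaped_reward(base_reward, state, next_state):
    """Potential-based shaping: r' = r + gamma * phi(s') - phi(s).
    Shaping of this form is known to preserve the optimal policy of the
    original reward while densifying the learning signal."""
    return base_reward + GAMMA * phi(*next_state) - phi(*state)


# Example: between two consecutive frames the tracker reduces both the
# centering and the distance error, so the shaping term is positive and
# rewards the improvement even if the base reward is sparse (here 0).
s, s_next = (0.4, 0.5), (0.1, 0.2)
print(shaped_reward(0.0, s, s_next) > 0)  # True: moving toward the target
```

Densifying a sparse tracking reward this way is one standard answer to the convergence difficulties the abstract mentions: the agent is rewarded for each step of progress rather than only when the target is already well framed.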
Main file: m26787-tiritiris.pdf (333.04 Ko)
Origin: Files produced by the author(s)

Dates and versions

hal-03281187, version 1 (08-07-2021)

Identifiers

  • HAL Id: hal-03281187, version 1

Cite

Pavlos Tiritiris, Nikolaos Passalis, Anastasios Tefas. Temporal Difference Rewards for End-to-end Vision-based Active Robot Tracking using Deep Reinforcement Learning. International Conference on Emerging Techniques in Computational Intelligence, ICETCI 2021, 2021, Virtual, India. ⟨hal-03281187⟩
20 views
82 downloads
