TRACK: A Multi-Modal Deep Architecture for Head Motion Prediction in 360-Degree Videos

Head motion prediction is an important problem for 360° videos, in particular to inform streaming decisions. Various methods tackling this problem with deep neural networks have been proposed recently. In this article, we introduce a new deep architecture, named TRACK, that benefits from both the history of past positions and knowledge of the video content. We show that TRACK achieves state-of-the-art performance when compared against all recent approaches, considering the same datasets and wider prediction horizons: from 0 to 5 seconds.

Keywords

360 videos, head motion prediction, deep recurrent networks, content analysis

Domains

Computer Vision and Pattern Recognition [cs.CV], Machine Learning [cs.LG], Multimedia [cs.MM], Neural Networks [cs.NE]

Main file

ICIP2020_accepted.pdf (275.48 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-02615980

Submitted on: Thursday, July 23, 2020, 19:45:11

Last modified on: Monday, April 15, 2024, 11:25:23

Long-term archiving on: Tuesday, December 1, 2020, 06:29:20

Dates and versions

hal-02615980, version 1 (23-07-2020)

Identifiers

HAL Id: hal-02615980, version 1

Cite

Miguel Fabian Romero Rondon, Lucile Sassatelli, Ramon Aparicio-Pardo, Frédéric Precioso. TRACK: A Multi-Modal Deep Architecture for Head Motion Prediction in 360-Degree Videos. ICIP 2020 - IEEE International Conference on Image Processing, Oct 2020, Abu Dhabi / Virtual, United Arab Emirates. ⟨hal-02615980⟩
