Learning to Segment Moving Objects

Pavel Tokmakov; Cordelia Schmid; Karteek Alahari

doi:10.1007/s11263-018-1122-2

Journal article, International Journal of Computer Vision, Year: 2019

Pavel Tokmakov

Role: Author

Learning models from massive data

The Robotics Institute

Cordelia Schmid

Role: Author
PersonId: 831154

Learning models from massive data

Karteek Alahari

Role: Author
PersonId: 19670
IdHAL: karteek
ORCID: 0000-0002-1838-5936
IdRef: 196283892

Learning models from massive data

Abstract

We study the problem of segmenting moving objects in unconstrained videos. Given a video, the task is to segment all the objects that exhibit independent motion in at least one frame. We formulate this as a learning problem and design our framework with three cues: (i) independent object motion between a pair of frames, which complements object recognition, (ii) object appearance, which helps to correct errors in motion estimation, and (iii) temporal consistency, which imposes additional constraints on the segmentation. The framework is a two-stream neural network with an explicit memory module. The two streams encode the appearance and motion cues in a video sequence, respectively, while the memory module captures the evolution of objects over time, exploiting temporal consistency. The motion stream is a convolutional neural network trained on synthetic videos to segment independently moving objects in the optical flow field. The module that builds a 'visual memory' of the video, i.e., a joint representation of all the video frames, is realized with a convolutional recurrent unit learned from a small number of training video sequences. For every pixel in a frame of a test video, our approach assigns an object or background label based on the learned spatio-temporal features as well as the 'visual memory' specific to the video. We evaluate our method extensively on three benchmarks: DAVIS, the Freiburg-Berkeley Motion Segmentation dataset, and SegTrack. In addition, we provide an extensive ablation study to investigate both the choice of the training data and the influence of each component in the proposed framework.
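The data flow described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract only specifies "a convolutional recurrent unit", so the GRU-style gating below is an assumption, as are all names (`ConvGRUCell`, `segment_video`, `w_out`), the channel counts, the 0.5 threshold, and the random, untrained weights. The appearance and motion feature maps are random stand-ins for the outputs of the two streams.

```python
import numpy as np

def conv2d_same(x, w):
    """Zero-padded 'same' 2D cross-correlation. x: (C_in, H, W), w: (C_out, C_in, k, k)."""
    c_out, c_in, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, wd = x.shape[1:]
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Add the contribution of kernel tap (i, j), summed over input channels.
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], xp[:, i:i + h, j:j + wd])
    return out

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class ConvGRUCell:
    """Hypothetical GRU-style convolutional recurrent unit (weights are random, not learned)."""

    def __init__(self, c_in, c_hid, k=3, seed=0):
        rng = np.random.default_rng(seed)
        shape = (c_hid, c_in + c_hid, k, k)
        self.w_z = rng.normal(0, 0.1, shape)  # update gate
        self.w_r = rng.normal(0, 0.1, shape)  # reset gate
        self.w_h = rng.normal(0, 0.1, shape)  # candidate state
        self.c_hid = c_hid

    def step(self, x, h):
        xh = np.concatenate([x, h], axis=0)
        z = sigmoid(conv2d_same(xh, self.w_z))
        r = sigmoid(conv2d_same(xh, self.w_r))
        cand = np.tanh(conv2d_same(np.concatenate([x, r * h], axis=0), self.w_h))
        # Blend the previous memory with the candidate, per pixel and channel.
        return (1 - z) * h + z * cand

def segment_video(appearance_feats, motion_feats, cell, w_out):
    """Fuse the two streams per frame, update the memory, and label each pixel."""
    _, _, h, w = appearance_feats.shape
    state = np.zeros((cell.c_hid, h, w))  # the 'visual memory' starts empty
    masks = []
    for a, m in zip(appearance_feats, motion_feats):
        x = np.concatenate([a, m], axis=0)  # two-stream fusion by concatenation
        state = cell.step(x, state)
        score = sigmoid(np.einsum("c,chw->hw", w_out, state))  # 1x1 classifier
        masks.append(score > 0.5)  # True = object, False = background
    return np.stack(masks)

# Toy input: 4 frames of 16x16 feature maps (8 appearance + 2 motion channels).
rng = np.random.default_rng(1)
app = rng.normal(size=(4, 8, 16, 16))
mot = rng.normal(size=(4, 2, 16, 16))
cell = ConvGRUCell(c_in=10, c_hid=6)
w_out = rng.normal(size=6)
masks = segment_video(app, mot, cell, w_out)
```

Here `masks` holds one boolean object/background map per frame. Because the memory state is carried from frame to frame, the label of a pixel can depend on earlier frames as well as the current one, which is the role the abstract assigns to temporal consistency.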

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

paper.pdf (5.85 MB)

Origin: Files produced by the author(s)

Contributor: THOTH Team

https://hal.science/hal-01653720

Submitted on: Tuesday, September 25, 2018, 20:46:11

Last modified on: Thursday, April 4, 2024, 21:41:51

Dates and versions

hal-01653720 , version 1 (01-12-2017)

hal-01653720 , version 2 (25-09-2018)

Identifiers

HAL Id : hal-01653720 , version 2
ARXIV : 1712.01127
DOI : 10.1007/s11263-018-1122-2

Cite

Pavel Tokmakov, Cordelia Schmid, Karteek Alahari. Learning to Segment Moving Objects. International Journal of Computer Vision, 2019, 127 (3), pp.282-301. ⟨10.1007/s11263-018-1122-2⟩. ⟨hal-01653720v2⟩

Collections

UNIV-RENNES1 UGA CNRS INRIA IRISA LJK LJK_GI INRIA2 LJK-GI-THOTH UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

1145 views

960 downloads
