Unsupervised Object Discovery and Tracking in Video Collections

Suha Kwak 1, 2; Minsu Cho 1, 2; Ivan Laptev 1, 2; Jean Ponce 1, 2; Cordelia Schmid 3
2 WILLOW - Models of visual object recognition and scene understanding
CNRS - Centre National de la Recherche Scientifique : UMR8548, Inria Paris-Rocquencourt, DI-ENS - Département d'informatique de l'École normale supérieure
3 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract: This paper addresses the problem of automatically localizing dominant objects as spatio-temporal tubes in a noisy collection of videos with minimal or even no supervision. We formulate the problem as a combination of two complementary processes: discovery and tracking. The first one establishes correspondences between prominent regions across videos, and the second one associates successive similar object regions within the same video. Interestingly, our algorithm also discovers the implicit topology of frames associated with instances of the same object class across different videos, a role normally left to supervisory information in the form of class labels in conventional image and video understanding methods. Indeed, as demonstrated by our experiments, our method can handle video collections featuring multiple object classes, and substantially outperforms the state of the art in colocalization, even though it tackles a broader problem with much less supervision.
Document type: Conference papers

Cited literature: 42 references

https://hal.archives-ouvertes.fr/hal-01153017
Contributor: Suha Kwak
Submitted on: Monday, December 7, 2015 - 2:25:33 PM
Last modification on: Thursday, February 7, 2019 - 2:42:22 PM
Document(s) archived on: Tuesday, March 8, 2016 - 1:34:12 PM

File

video_obj_local.pdf
Files produced by the author(s)

Citation

Suha Kwak, Minsu Cho, Ivan Laptev, Jean Ponce, Cordelia Schmid. Unsupervised Object Discovery and Tracking in Video Collections. ICCV - IEEE International Conference on Computer Vision, Dec 2015, Santiago, Chile. pp.3173-3181, ⟨10.1109/ICCV.2015.363⟩. ⟨hal-01153017v2⟩