Conference papers

Unsupervised Object Discovery and Tracking in Video Collections

Suha Kwak 1, 2 Minsu Cho 1, 2 Ivan Laptev 1, 2 Jean Ponce 1, 2 Cordelia Schmid 3
2 WILLOW - Models of visual object recognition and scene understanding
CNRS - Centre National de la Recherche Scientifique : UMR8548, Inria Paris-Rocquencourt, DI-ENS - Département d'informatique de l'École normale supérieure
3 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract : This paper addresses the problem of automatically localizing dominant objects as spatio-temporal tubes in a noisy collection of videos with minimal or even no supervision. We formulate the problem as a combination of two complementary processes: discovery and tracking. The first one establishes correspondences between prominent regions across videos, and the second one associates successive similar object regions within the same video. Interestingly, our algorithm also discovers the implicit topology of frames associated with instances of the same object class across different videos, a role normally left to supervisory information in the form of class labels in conventional image and video understanding methods. Indeed, as demonstrated by our experiments, our method can handle video collections featuring multiple object classes, and substantially outperforms the state of the art in colocalization, even though it tackles a broader problem with much less supervision.

Cited literature [42 references]
Contributor : Suha Kwak
Submitted on : Monday, December 7, 2015 - 2:25:33 PM
Last modification on : Friday, April 17, 2020 - 11:46:03 AM
Document(s) archived on : Tuesday, March 8, 2016 - 1:34:12 PM


Files produced by the author(s)




Suha Kwak, Minsu Cho, Ivan Laptev, Jean Ponce, Cordelia Schmid. Unsupervised Object Discovery and Tracking in Video Collections. ICCV - IEEE International Conference on Computer Vision, Dec 2015, Santiago, Chile. pp.3173-3181, ⟨10.1109/ICCV.2015.363⟩. ⟨hal-01153017v2⟩


