Weakly Supervised Learning for Visual Recognition

Thibaut Durand 1
1 MLIA - Machine Learning and Information Access
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract : This thesis studies image classification, where the goal is to predict whether a semantic category is present in an image based on its visual content. To analyze complex scenes, it is important to learn localized representations. To limit the cost of annotation during training, we focus on weakly supervised learning approaches. We propose several models that simultaneously classify and localize objects using only global labels during training. Weak supervision significantly reduces annotation cost compared with full annotation, but it makes learning more challenging. The key issue is how to aggregate local scores (e.g., region scores) into a global score (e.g., an image score). The main contribution of this thesis is the design of new pooling functions for weakly supervised learning. In particular, we propose a “max + min” pooling function, which unifies many pooling functions. We describe how to use this pooling in the Latent Structured SVM framework as well as in convolutional networks. To solve the resulting optimization problems, we present several solvers, some of which can optimize a ranking metric such as Average Precision. We experimentally demonstrate the benefits of our models relative to state-of-the-art methods on ten standard image classification datasets, including the large-scale ImageNet dataset.
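
The aggregation step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `max_plus_min_pool` and the weight `alpha` are assumptions introduced here for illustration, and the abstract only states that region scores are combined with a "max + min" rule.

```python
from typing import Sequence


def max_plus_min_pool(region_scores: Sequence[float], alpha: float = 1.0) -> float:
    """Aggregate per-region scores into a single image-level score.

    `alpha` is a hypothetical weight on the min term (an assumption, not
    taken from the abstract): alpha = 0 reduces to plain max pooling,
    alpha = 1 gives the unweighted "max + min" sum.
    """
    if not region_scores:
        raise ValueError("need at least one region score")
    # Highest-scoring region plus the (weighted) lowest-scoring region.
    return max(region_scores) + alpha * min(region_scores)


# Illustrative region scores for one image (made-up values).
scores = [0.25, 1.5, -0.75, 1.0]
image_score = max_plus_min_pool(scores)
max_only_score = max_plus_min_pool(scores, alpha=0.0)
```

With `alpha = 0.0` the sketch falls back to ordinary max pooling, which is one simple way a single formula can cover more than one pooling rule.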

Cited literature [199 references]

https://hal.archives-ouvertes.fr/tel-01667325
Contributor : Thibaut Durand
Submitted on : Tuesday, December 19, 2017 - 11:48:33 AM
Last modification on : Thursday, March 21, 2019 - 1:20:38 PM

File

thesis.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-01667325, version 1

Citation

Thibaut Durand. Weakly Supervised Learning for Visual Recognition. Computer Vision and Pattern Recognition [cs.CV]. Université Pierre et Marie Curie, 2017. English. ⟨tel-01667325⟩

Metrics

Record views : 325
File downloads : 253