Conference papers

Modeling Visual Context is Key to Augmenting Object Detection Datasets

Nikita Dvornik 1 Julien Mairal 1 Cordelia Schmid 1
1 Thoth - Apprentissage de modèles à partir de données massives
LJK - Laboratoire Jean Kuntzmann , Inria Grenoble - Rhône-Alpes
Abstract : Data augmentation is well known to be important for training deep neural networks for visual recognition. By artificially increasing the number of training examples, it helps reduce overfitting and improves generalization. For object detection, classical data augmentation approaches generate new images by applying basic geometric transformations and color changes to the original training images. In this work, we go one step further and leverage segmentation annotations to increase the number of object instances present in the training data. For this approach to be successful, we show that appropriately modeling the visual context surrounding objects is crucial to placing them in the right environment. Otherwise, we show that this augmentation strategy actually hurts performance. With our context model, we achieve significant mean average precision improvements on the VOC’12 benchmark when few labeled examples are available.
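To make the idea of instance-level augmentation concrete, here is a minimal sketch of the copy-paste step only: an object cut out with its segmentation mask is pasted into another image, and a new bounding box is derived for the detector. This is not the authors' implementation. The function name `paste_instance`, the nested-list image representation, and the inclusive box format are assumptions made for illustration, and the paste location is simply passed in by the caller, whereas the abstract says it should come from a model of the visual context.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values (toy representation)
Mask = List[List[int]]   # binary mask, 1 where the object is, same size as the patch
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def paste_instance(background: Image, patch: Image, mask: Mask,
                   top: int, left: int) -> Tuple[Image, Box]:
    """Paste the masked pixels of `patch` onto a copy of `background`.

    (top, left) is where the patch's upper-left corner lands. In the paper's
    setting this location would be proposed by a context model; here it is
    a hypothetical input supplied by the caller.
    """
    h, w = len(patch), len(patch[0])
    if top < 0 or left < 0 or top + h > len(background) or left + w > len(background[0]):
        raise ValueError("patch does not fit inside the background")

    out = [row[:] for row in background]  # leave the original image untouched
    xs, ys = [], []
    for i in range(h):
        for j in range(w):
            if mask[i][j]:  # copy only pixels that belong to the object
                out[top + i][left + j] = patch[i][j]
                ys.append(top + i)
                xs.append(left + j)
    if not ys:
        raise ValueError("mask is empty")

    # Tight bounding box around the pasted pixels, usable as a new annotation.
    return out, (min(xs), min(ys), max(xs), max(ys))


background = [[0] * 4 for _ in range(4)]
patch = [[9, 9], [9, 9]]
mask = [[1, 0], [1, 1]]
augmented, box = paste_instance(background, patch, mask, top=1, left=2)
```

A real pipeline would operate on color image arrays and typically blend the object boundary; the sketch omits both to stay dependency-free.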
Document type : Conference papers

Contributor : Nikita Dvornik
Submitted on : Thursday, July 19, 2018 - 1:45:40 PM
Last modification on : Friday, February 4, 2022 - 3:12:17 AM
Long-term archiving on : Saturday, October 20, 2018 - 3:52:44 PM






Nikita Dvornik, Julien Mairal, Cordelia Schmid. Modeling Visual Context is Key to Augmenting Object Detection Datasets. ECCV 2018 - European Conference on Computer Vision, Sep 2018, Munich, Germany. pp.375-391, ⟨10.1007/978-3-030-01258-8_23⟩. ⟨hal-01844474⟩


