Detect, Replace, Refine: Deep Structured Prediction For Pixel Wise Labeling

Spyros Gidaris; Nikos Komodakis

Report (Research Report) Year: 2019

Spyros Gidaris

Role: Author
PersonId : 974024

École des Ponts ParisTech

Laboratoire d'Informatique Gaspard-Monge

imagine [Marne-la-Vallée]

Nikos Komodakis

Role: Author
PersonId : 945099

École des Ponts ParisTech

Laboratoire d'Informatique Gaspard-Monge

imagine [Marne-la-Vallée]

Abstract

Pixel-wise image labeling is an interesting and challenging problem of great significance in the computer vision community. For a dense labeling algorithm to achieve accurate and precise results, it has to consider the dependencies that exist in the joint space of the input and output variables. An implicit approach to modeling those dependencies is to train a deep neural network that, given an initial estimate of the output labels and the input image, predicts a new, refined estimate of the labels. In this context, our work is concerned with the optimal architecture for performing this label improvement task. We argue that the prior approaches of either directly predicting new label estimates or predicting residual corrections w.r.t. the initial labels with feed-forward deep network architectures are sub-optimal. Instead, we propose a generic architecture that decomposes the label improvement task into three steps: 1) detecting the initial label estimates that are incorrect, 2) replacing the incorrect labels with new ones, and 3) refining the renewed labels by predicting residual corrections w.r.t. them. Furthermore, we explore and compare various alternative architectures that consist of the aforementioned Detection, Replace, and Refine components. We extensively evaluate the examined architectures on the challenging task of dense disparity estimation (stereo matching) and report both quantitative and qualitative results on three different datasets. Finally, our dense disparity estimation network, which implements the proposed generic architecture, achieves state-of-the-art results on the KITTI 2015 test set, surpassing prior approaches by a significant margin. We also provide preliminary results of our approach on two semantic segmentation tasks, Cityscapes and ECP facade parsing, where we obtain very encouraging results.
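
The three-step decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' network: the function name `detect_replace_refine`, the three callables, the soft blending of old and proposed labels by the error map, and the toy stand-ins are all assumptions made for illustration, since the abstract names the steps but does not specify how they are combined. In the report the three components are learned deep networks; here they are simple hand-written functions so the data flow can be followed.

```python
import numpy as np

def detect_replace_refine(image, labels, detect_fn, replace_fn, refine_fn):
    """Compose the three label-improvement steps (illustrative sketch only)."""
    # 1) Detect: per-pixel score in [0, 1]; higher means the initial label
    #    is more likely to be incorrect.
    error_map = np.clip(detect_fn(image, labels), 0.0, 1.0)
    # 2) Replace: propose new labels and use them only where errors were
    #    detected (assumed blending rule, not taken from the abstract).
    proposed = replace_fn(image, labels, error_map)
    renewed = error_map * proposed + (1.0 - error_map) * labels
    # 3) Refine: add a residual correction to the renewed labels.
    refined = renewed + refine_fn(image, renewed)
    return error_map, renewed, refined

# Hypothetical stand-ins for the learned components.
def toy_detect(image, labels):
    # Flag labels that deviate strongly from the median label.
    return (np.abs(labels - np.median(labels)) > 5.0).astype(float)

def toy_replace(image, labels, error_map):
    # Propose the median label everywhere.
    return np.full_like(labels, np.median(labels))

def toy_refine(image, renewed):
    # Residual that pulls each label halfway toward the mean.
    return 0.5 * (renewed.mean() - renewed)

image = np.zeros((2, 3))  # unused by the toy components
initial = np.array([[10.0, 10.0, 50.0],
                    [10.0, 12.0, 10.0]])  # one gross outlier at [0, 2]
error_map, renewed, refined = detect_replace_refine(
    image, initial, toy_detect, toy_replace, toy_refine)
```

In this toy run only the outlier is flagged and replaced, while the other labels pass through the Replace step unchanged and receive only the small residual from the Refine step. This mirrors the motivation stated in the abstract for separating hard mistakes from fine corrections.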

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

technical_report.pdf (9.22 MB)

Origin: Files produced by the author(s)

https://hal.science/hal-01976855

Submitted on: Thursday, January 10, 2019 - 12:27:07

Last modified on: Thursday, March 28, 2024 - 03:27:30

Dates and versions

hal-01976855, version 1 (10-01-2019)

Identifiers

HAL Id: hal-01976855, version 1

Cite

Spyros Gidaris, Nikos Komodakis. Detect, Replace, Refine: Deep Structured Prediction For Pixel Wise Labeling. [Research Report] LIGM - Laboratoire d'Informatique Gaspard-Monge; ENPC - École des Ponts ParisTech; IMAGINE [Marne-la-Vallée]. 2019. ⟨hal-01976855⟩

Collections

ENPC CNRS LIGM_A3SI PARISTECH LIGM IMAGINE LARA UNIV-EIFFEL JSE2024
