Text Detection and Recognition in Urban Scenes

Abstract : Text detection and recognition in real images taken in unconstrained environments, such as street view images, remain surprisingly challenging in computer vision. In this paper, we present a comprehensive strategy combining bottom-up and top-down mechanisms to detect text boxes. The bottom-up part is based on character segmentation and grouping. The top-down part is achieved with a statistical learning approach based on box descriptors. Our main contribution is a new descriptor, Fuzzy HOG (F-HOG), fully adapted for text box analysis. A thorough experimental validation demonstrates the efficiency of the whole system, which outperforms state-of-the-art results on the standard ICDAR text detection benchmark. Another contribution concerns the use of our text extraction in a complete search engine scheme. We propose to retrieve a location from a textual query: by combining our text box detection technology with OCR on georeferenced street images, we obtain a GIS system with fully automatic textual indexing. We demonstrate the relevance of our system on the real urban database of [10].
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01286100
Contributor : Lip6 Publications
Submitted on : Thursday, March 10, 2016 - 11:47:50 AM
Last modification on : Thursday, March 21, 2019 - 2:16:07 PM

Citation

Rodrigo Minetto, Nicolas Thome, Matthieu Cord, Jorge Stolfi, Frédéric Precioso, et al. Text Detection and Recognition in Urban Scenes. International Conference on Computer Vision (ICCV): Workshop on Computer Vision for Remote Sensing of the Environment, Nov 2011, Barcelona, Spain. pp.227-234, ⟨10.1109/ICCVW.2011.6130247⟩. ⟨hal-01286100⟩
