Text detection in images and videos for semantic indexing

Abstract : This work is situated within the framework of image and video indexing. One way to include semantic knowledge in the indexing process is to use the text contained in the images and video sequences. This text is rich in information but easy to use. Existing methods for text detection are simple: most of them are based on texture estimation or edge detection followed by an accumulation of these characteristics. We propose using geometrical features very early in the detection chain: a first coarse detection calculates a text "probability" image. Afterwards, for each pixel we calculate geometrical properties of the possible surrounding text rectangle, which are added to the features of the first step and fed into a support vector machine classifier. For the application to video sequences, we propose an algorithm which detects text on a frame-by-frame basis, tracks the detected text rectangles across multiple frames, and robustly integrates the frames into a single image. We also tackle the character segmentation problem and propose two different methods. The first algorithm maximizes a criterion based on the local contrast in the image. The second approach exploits a priori knowledge about the spatial binary distribution of the pixels. This prior knowledge, in the form of a Markov random field model, is integrated into a Bayesian estimation framework in order to obtain an estimate of the original binary image.
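As an illustration of the "edge detection followed by an accumulation" family of detectors that the abstract mentions, the following is a minimal sketch in Python with NumPy. It is not the detector proposed in this work: the function name `text_likelihood`, the `window` parameter, and the choice of a simple horizontal difference with a box filter are assumptions made for the example. It produces a coarse map in which regions dense in vertical strokes, as text tends to be, score higher than flat regions.

```python
import numpy as np

def text_likelihood(gray, window=15):
    """Coarse text-likelihood map (illustrative sketch, hypothetical helper).

    gray   : 2-D array of grey levels.
    window : width in pixels of the horizontal accumulation window (assumed value).
    """
    gray = np.asarray(gray, dtype=float)

    # Absolute horizontal difference: character strokes create many
    # strong vertical edges close to each other.
    grad = np.zeros_like(gray)
    grad[:, 1:] = np.abs(np.diff(gray, axis=1))

    # Accumulate the edge responses along each row with a box filter.
    kernel = np.ones(window)
    acc = np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, grad
    )

    # Normalise to [0, 1]; a flat image yields an all-zero map.
    peak = acc.max()
    return acc / peak if peak > 0 else acc

# Synthetic example: a striped block stands in for a line of text.
image = np.zeros((20, 60))
image[5:15, 20:40:2] = 255
likelihood = text_likelihood(image, window=9)
```

In a pipeline of the kind described above, such a map would only be the first, coarse step; the geometrical features and the classifier would operate on top of it.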
Document type :
Preprints, Working Papers, ...

Contributor : Équipe Gestionnaire Des Publications Si Liris
Submitted on : Monday, February 13, 2017 - 11:10:59 AM
Last modification on : Tuesday, February 26, 2019 - 4:35:36 PM


  • HAL Id : hal-01465885, version 1


Christian Wolf. Text detection in images and videos for semantic indexing. 2003. ⟨hal-01465885⟩


