A Hybrid Page Segmentation Approach Based on a Stroke Descriptor
Abstract
In this paper, we present a new hybrid page segmentation approach based on connected component and region analysis. We first describe our stroke descriptor, which detects text and line component candidates using the skeleton of the binarized document image. The Chan and Vese active contour model is then applied to segment the rest of the image into photo and background regions. This classification is verified by studying the variation of each detected region. Finally, we cluster the text candidates according to their sizes using mean-shift analysis, and we present our multiscale projection profile approach, which groups horizontal and vertical text regions separately. We evaluate the performance of our approach by comparing it to existing methods that participated in the ICDAR page segmentation competition.
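As an illustration of the projection profile idea mentioned above, the following is a minimal sketch and not the authors' multiscale method. It assumes a binarized image stored as a list of rows of 0/1 values; the function names `projection_profile` and `split_runs` and the `min_gap` parameter are hypothetical, introduced only for this example. Increasing `min_gap` merges nearby runs, which loosely mimics analysing the profile at a coarser scale.

```python
from typing import List, Tuple


def projection_profile(image: List[List[int]], axis: int) -> List[int]:
    """Count foreground pixels (value 1) per row (axis=0) or per column (axis=1)."""
    if axis == 0:
        return [sum(row) for row in image]
    return [sum(col) for col in zip(*image)]


def split_runs(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, of non-empty runs in a profile.

    Runs separated by fewer than `min_gap` empty positions are merged.
    `min_gap` is a hypothetical scale parameter used for illustration.
    """
    runs = []
    start = None  # index where the current run began, or None if outside a run
    gap = 0       # number of consecutive empty positions seen inside the run
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # Close the run at the first empty position of the gap.
                runs.append((start, i - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # Close a run that reaches the end, dropping trailing empty positions.
        runs.append((start, len(profile) - gap))
    return runs


# Toy binarized page: two text bands separated by one empty row.
page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]

row_profile = projection_profile(page, axis=0)
fine_bands = split_runs(row_profile, min_gap=1)    # separates the two bands
coarse_bands = split_runs(row_profile, min_gap=2)  # merges them into one
```

Running `split_runs` on the row profile gives candidate horizontal bands, and running it on the column profile (`axis=1`) gives candidate vertical ones. How the scales are chosen and combined in the actual approach is not specified in the abstract.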