An Efficient Parametrization of Character Degradation Model for Semi-synthetic Image Generation

Abstract : This paper presents an efficient parametrization method for generating synthetic noise on document images. Given the desired categories and amount of noise, the method generates synthetic document images exhibiting most of the degradations observed in real document images (ink splotches, white specks or streaks). Because it can simulate different amounts and kinds of noise, it can be used to evaluate the robustness of many document image analysis methods. It also makes it possible to generate data for algorithms that rely on a learning process. The degradation model presented in [7] needs eight parameters to randomly generate noise regions. We propose an extension of this model that sets these eight parameters automatically so as to generate precisely what a user wants (amount and category of noise). Our proposal consists of three steps. First, Nsp seed-points (i.e. centres of noise regions) are selected by an adaptive procedure. Then, these seed-points are classified into three categories of noise using a heuristic rule. Finally, the size of each noise region is set using a random process in order to generate degradations that are as realistic as possible.
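
The three steps named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the adaptive procedure, the heuristic rule or the random process, so everything below is an assumption made for illustration only: the function names, the choice of stroke-edge pixels as seed candidates, the neighbour-count rule and the uniform radius are hypothetical stand-ins and not the authors' method. The image is assumed to be a binary grid (1 = ink, 0 = background).

```python
import random

# Category labels borrowed from the degradations listed in the abstract;
# the identifiers themselves are hypothetical.
CATEGORIES = ("ink_splotch", "white_speck", "streak")


def edge_pixels(image):
    """Return (row, col) of pixels that have a 4-neighbour of the opposite value."""
    h, w = len(image), len(image[0])
    found = []
    for r in range(h):
        for c in range(w):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and image[rr][cc] != image[r][c]:
                    found.append((r, c))
                    break
    return found


def select_seed_points(image, n_sp, rng):
    """Step 1 (assumed): draw up to n_sp distinct seed-points along stroke edges."""
    candidates = edge_pixels(image)
    return rng.sample(candidates, min(n_sp, len(candidates)))


def classify_seed(image, seed):
    """Step 2 (assumed rule): pick a noise category from the local neighbourhood."""
    r, c = seed
    if image[r][c] == 0:
        return "ink_splotch"  # a background seed adds ink
    h, w = len(image), len(image[0])
    ink_neighbours = sum(
        image[rr][cc]
        for rr in range(max(0, r - 1), min(h, r + 2))
        for cc in range(max(0, c - 1), min(w, c + 2))
        if (rr, cc) != (r, c)
    )
    # Thick strokes get a white speck, thin strokes a streak (arbitrary threshold).
    return "white_speck" if ink_neighbours >= 5 else "streak"


def plan_noise_regions(image, n_sp, max_radius=3, seed=0):
    """Step 3 (assumed): attach a uniformly random radius to each classified seed."""
    rng = random.Random(seed)
    return [
        {
            "centre": point,
            "category": classify_seed(image, point),
            "radius": rng.randint(1, max_radius),
        }
        for point in select_seed_points(image, n_sp, rng)
    ]


if __name__ == "__main__":
    # A 5x5 image with a one-pixel-wide vertical stroke in the middle column.
    demo = [[0, 0, 1, 0, 0] for _ in range(5)]
    for region in plan_noise_regions(demo, n_sp=4):
        print(region)
```

The sketch only produces a plan (centre, category, radius) for each noise region. Rendering the regions onto the image, and mapping the plan to the eight parameters of the model in [7], is left out because the abstract does not describe it.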
Contributor : Van Cuong Kieu
Submitted on : Friday, June 13, 2014 - 4:27:40 PM
Last modification on : Thursday, January 11, 2018 - 6:20:17 AM
Document(s) archived on : Saturday, September 13, 2014 - 11:30:21 AM


Files produced by the author(s)


  • HAL Id : hal-01006078, version 1



Van Cuong Kieu, Muriel Visani, Nicholas Journet, Rémy Mullot, Jean-Philippe Domenger. An Efficient Parametrization of Character Degradation Model for Semi-synthetic Image Generation. 2nd International Workshop on Historical Document Imaging and Processing, Aug 2013, Washington, DC, United States. ⟨hal-01006078⟩


