Conference paper, Year: 2014

SmartDoc-QA: A Dataset for Quality Assessment of Smartphone Captured Document Images - Single and Multiple Distortions

Abstract

Smartphones enable new ways of capturing documents, which creates a need for seamless and reliable document acquisition and digitization. Quality assessment is an important step in both processes: it can guide users during capture or help improve image enhancement after a document has been captured. However, the field of document image quality assessment currently lacks databases. To provide a baseline benchmark for quality assessment methods for mobile-captured documents, we present a dataset that contains both singly and multiply distorted document images. The dataset can be used to benchmark quality assessment methods against the objective measure of OCR accuracy, and also to benchmark quality enhancement methods. It contains three types of documents: modern documents, old administrative letters, and receipts. The images are captured under varying conditions (lighting, different types of blur, and perspective angles), which cause geometric and photometric distortions that hinder OCR. The ground truth consists of the text transcriptions of the documents, the OCR results of the captured documents, and the values of the capture parameters used for each image. We also show how the dataset can be used for evaluation in no-reference quality assessment. The dataset is freely and publicly available to the research community at http://navidomass.univ-lr.fr/SmartDoc-QA.
No file deposited

Dates and versions

hal-01319900, version 1 (23-05-2016)

Identifiers

  • HAL Id: hal-01319900, version 1

Cite

Nibal Nayef, Muhammad Muzzamil Luqman, Sophea Prum, Sébastien Eskenazi, Joseph Chazalon, et al. SmartDoc-QA: A Dataset for Quality Assessment of Smartphone Captured Document Images - Single and Multiple Distortions. International Workshop on Camera Based Document Analysis and Recognition (CBDAR 2015), Aug 2014, Nancy, France. pp. 1231-1235. ⟨hal-01319900⟩

Collections

L3I UNIV-ROCHELLE
