SCUT-COUCH2009-TL: An Unconstrained Online Handwritten Chinese Text Lines Dataset
Conference paper. Year: 2010

Abstract

An unconstrained online handwritten Chinese text line dataset, SCUT-COUCH2009-TL, a subset of SCUT-COUCH [1], has been built to facilitate research on unconstrained online Chinese text recognition. Texts for hand-copying were drawn from the China Daily corpus by stratified random sampling. The current version of SCUT-COUCH2009-TL contains 8,809 text lines (4,813 collected with a touch-screen LCD and 3,996 with a digital pen) and 159,866 characters in total, written by more than 157 participants. To demonstrate that the dataset is practical, an algorithm based on over-segmentation, dynamic programming, and a semantic model is presented for segmenting and recognizing unconstrained online Chinese text lines. In preliminary experiments on the dataset, the proposed algorithm achieves a baseline recognition accuracy of 56.41%.
File not deposited

Dates and versions

hal-00580597, version 1 (28-03-2011)

Identifiers

  • HAL Id: hal-00580597, version 1

Cite

Hanyu Yan, Lianwen Jin, Christian Viard-Gaudin, Harold Mouchère. SCUT-COUCH2009-TL: An Unconstrained Online Handwritten Chinese Text Lines Dataset. International Conference on Frontiers in Handwriting Recognition, Nov 2010, India. ⟨hal-00580597⟩