SCUT-COUCH2009-TL: An Unconstrained Online Handwritten Chinese Text Lines Dataset

Hanyu Yan 1, Lianwen Jin 1, Christian Viard-Gaudin 2, Harold Mouchère 2
2 IRCCyN - Institut de Recherche en Communications et en Cybernétique de Nantes (irccyn-ivc)
Abstract : An unconstrained online handwritten Chinese text line dataset, SCUT-COUCH2009-TL, a subset of SCUT-COUCH [1], is built to facilitate research on unconstrained online Chinese text recognition. Texts for hand-copying are sampled from the China Daily corpus in a stratified random manner. The current version of SCUT-COUCH2009-TL has 8,809 text lines (4,813 lines collected with a touch-screen LCD and 3,996 with a digital pen) and 159,866 characters in total, written by more than 157 participants. To demonstrate that the dataset is practical, an algorithm based on over-segmentation, dynamic programming, and a semantic model is presented for segmenting and recognizing unconstrained online Chinese text lines. In preliminary experiments on the dataset, the proposed algorithm achieves a baseline recognition accuracy of 56.41%.
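The abstract names the ingredients of the baseline (over-segmentation, dynamic programming, a semantic model) without giving details. As a rough illustration only, and not the authors' implementation, the following Python sketch shows how such a pipeline is commonly arranged: a text line is assumed to be already over-segmented into fragments, consecutive fragments are merged into character candidates, and dynamic programming selects the path that maximizes recognizer score plus a language-model score. The function `segment_and_recognize`, the toy candidate table, the `max_merge` limit, and the bigram model standing in for the semantic model are all assumptions made for this example.

```python
import math

def segment_and_recognize(n_fragments, recognize, bigram_logp, max_merge=3):
    """Find the best way to merge consecutive fragments into characters.

    recognize(i, j) -> list of (label, log_prob) for fragments[i:j] (hypothetical).
    bigram_logp(prev, cur) -> log-probability of cur following prev (hypothetical).
    Returns (best_log_score, best_text).
    """
    # best[j] maps the last label to (score, text) for paths covering fragments[0:j].
    best = [dict() for _ in range(n_fragments + 1)]
    best[0]["<s>"] = (0.0, "")
    for j in range(1, n_fragments + 1):
        for i in range(max(0, j - max_merge), j):
            for label, shape_logp in recognize(i, j):
                for prev, (score, text) in best[i].items():
                    total = score + shape_logp + bigram_logp(prev, label)
                    if label not in best[j] or total > best[j][label][0]:
                        best[j][label] = (total, text + label)
    if not best[n_fragments]:
        return (-math.inf, "")
    return max(best[n_fragments].values())

# Toy data (invented): fragments 0 and 1 read as "女" and "子" separately,
# or as "好" when merged; fragment 2 reads as "人".
table = {
    (0, 1): [("女", math.log(0.6))],
    (1, 2): [("子", math.log(0.6))],
    (0, 2): [("好", math.log(0.5))],
    (2, 3): [("人", math.log(0.9))],
}

def recognize(i, j):
    return table.get((i, j), [])

def bigram_logp(prev, cur):
    # Toy bigram model: only the pair ("好", "人") is considered likely.
    return math.log(0.5) if (prev, cur) == ("好", "人") else math.log(0.1)

best_score, best_text = segment_and_recognize(3, recognize, bigram_logp)
print(best_text)
```

In this toy case the language-model term tips the decision toward merging the first two fragments, even though each fragment alone has a higher recognizer score than the merged candidate.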

https://hal.archives-ouvertes.fr/hal-00580597
Contributor : Harold Mouchère
Submitted on : Monday, March 28, 2011 - 4:23:56 PM
Last modification on : Wednesday, December 19, 2018 - 3:02:08 PM

Identifiers

  • HAL Id : hal-00580597, version 1

Citation

Hanyu Yan, Lianwen Jin, Christian Viard-Gaudin, Harold Mouchère. SCUT-COUCH2009-TL: An Unconstrained Online Handwritten Chinese Text Lines Dataset. International Conference on Frontiers in Handwriting Recognition, Nov 2010, India. ⟨hal-00580597⟩
