UNSUPERVISED HANDWRITTEN GRAPHICAL SYMBOL LEARNING Using Minimum Description Length Principle on Relational Graph

Abstract : Approaches to handwriting recognition generally require prior knowledge of the symbol set and as many ground-truthed samples as possible, so that machine-learning methods can be applied. In this work, we propose to discover the symbol set used in a graphical language produced by on-line handwriting. We consider two-dimensional graphical languages such as mathematical expressions, where layouts other than left-to-right must also be handled. First, we select relevant graphemes using hierarchical clustering. Second, we build a relational graph between the strokes that make up a handwritten expression. Third, we extract the lexicon, which is a set of graph substructures, using the minimum description length principle. To assess the extracted lexicon, we introduce a hierarchical segmentation task. In our experiments, we report a recall rate of 84.2% on the test part of our database produced by 100 writers.
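
The third step can be illustrated with a minimal sketch. It is not the authors' encoding or search procedure: it assumes strokes have already been assigned grapheme labels, restricts candidate substructures to single labelled edges, and uses a deliberately crude bit count. The labels (`hbar`, `vbar`), the relations (`above`, `cross`, `right`) and all function names are hypothetical and chosen only for illustration. The candidate that minimises the cost of describing the substructure plus the cost of the graph rewritten with it is kept as a lexicon entry.

```python
import math
from collections import defaultdict


def description_length(n_nodes, n_edges, n_node_labels, n_edge_labels):
    """Crude bit count (an assumption, not the paper's encoding):
    one label per node, plus one relation label and two endpoints per edge."""
    label_bits = math.log2(max(n_node_labels, 2))
    relation_bits = math.log2(max(n_edge_labels, 2))
    endpoint_bits = 2 * math.log2(max(n_nodes, 2))
    return n_nodes * label_bits + n_edges * (relation_bits + endpoint_bits)


def find_instances(nodes, edges):
    """Group edges by (label, relation, label) and greedily keep
    instances of the same pattern that share no stroke."""
    instances = defaultdict(list)
    used = defaultdict(set)
    for src, rel, dst in edges:
        pattern = (nodes[src], rel, nodes[dst])
        if src in used[pattern] or dst in used[pattern]:
            continue
        used[pattern].update((src, dst))
        instances[pattern].append((src, dst))
    return instances


def best_substructure(nodes, edges):
    """Return the one-edge pattern minimising DL(S) + DL(G | S)."""
    n_labels = len(set(nodes.values()))
    n_relations = len({rel for _, rel, _ in edges})
    pattern_cost = description_length(2, 1, n_labels, n_relations)
    scores = {}
    for pattern, found in find_instances(nodes, edges).items():
        k = len(found)
        # Each instance collapses two strokes and their edge into one new node,
        # which also introduces one new node label.
        compressed = description_length(
            len(nodes) - k, len(edges) - k, n_labels + 1, n_relations
        )
        scores[pattern] = pattern_cost + compressed
    best = min(scores, key=scores.get)
    return best, scores[best]


# Toy relational graph: three "=" signs (two stacked bars) and one "+" sign.
NODES = {0: "hbar", 1: "hbar", 2: "hbar", 3: "hbar",
         4: "hbar", 5: "vbar", 6: "hbar", 7: "hbar"}
EDGES = [(0, "above", 1), (2, "above", 3), (4, "cross", 5), (6, "above", 7),
         (1, "right", 4), (5, "right", 2), (3, "right", 6)]

if __name__ == "__main__":
    pattern, cost = best_substructure(NODES, EDGES)
    print(pattern, round(cost, 2))
```

On this toy graph the two stacked bars of the "=" sign recur most often without overlap, so that pattern yields the shortest total description. A fuller treatment would grow substructures beyond one edge and repeat the search on the compressed graph.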

https://hal.archives-ouvertes.fr/hal-00615217
Contributor : Harold Mouchère
Submitted on : Thursday, August 18, 2011 - 12:01:30 PM
Last modification on : Wednesday, December 19, 2018 - 3:02:03 PM

Identifiers

  • HAL Id : hal-00615217, version 1

Citation

Jinpeng Li, Harold Mouchère, Christian Viard-Gaudin. UNSUPERVISED HANDWRITTEN GRAPHICAL SYMBOL LEARNING Using Minimum Description Length Principle on Relational Graph. International Conference on Knowledge Discovery and Information Retrieval, KDIR 2011, Oct 2011, Paris, France. ⟨hal-00615217⟩
