Handling out-of-vocabulary words and recognition errors based on word linguistic context for handwritten sentence recognition

Solen Quiniou 1, *, Mohamed Cheriet 1, Eric Anquetil 2
* Corresponding author
2 IMADOC - Interprétation et Reconnaissance d’Images et de Documents
UR1 - Université de Rennes 1, INSA Rennes - Institut National des Sciences Appliquées - Rennes, CNRS - Centre National de la Recherche Scientifique : UMR6074
Abstract: In this paper, we investigate the use of linguistic information provided by language models to deal with word recognition errors in handwritten sentences. We focus especially on errors due to out-of-vocabulary (OOV) words. First, word posterior probabilities are computed and used to detect error hypotheses in the output sentences. An SVM classifier then categorizes these errors according to defined types. Next, a post-processing step is performed using a language model based on Part-of-Speech (POS) tags, which is combined with the n-gram model used previously. Thus, error hypotheses can be further recognized and POS tags can be assigned to the OOV words. Experiments on on-line handwritten sentences show that the proposed approach significantly reduces the word error rate.
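The error-detection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an N-best list of sentence hypotheses that are already aligned position by position, and it flags any word of the best hypothesis whose posterior probability falls below a threshold. The names `word_posteriors`, `error_hypotheses`, the toy N-best list and the threshold value are all hypothetical.

```python
import math


def word_posteriors(nbest):
    """Estimate word posteriors for the best hypothesis of an N-best list.

    nbest: list of (words, log_score) pairs. Assumption: all hypotheses
    have the same length, so words can be compared position by position.
    """
    # Turn log-scores into normalized hypothesis probabilities (log-sum-exp).
    max_log = max(score for _, score in nbest)
    weights = [math.exp(score - max_log) for _, score in nbest]
    total = sum(weights)
    probs = [w / total for w in weights]

    best_words = max(nbest, key=lambda hyp: hyp[1])[0]
    posteriors = []
    for pos, word in enumerate(best_words):
        # Posterior = probability mass of hypotheses agreeing on this word.
        mass = sum(p for (words, _), p in zip(nbest, probs) if words[pos] == word)
        posteriors.append((word, mass))
    return posteriors


def error_hypotheses(posteriors, threshold=0.5):
    """Return the words whose posterior is below the (assumed) threshold."""
    return [word for word, p in posteriors if p < threshold]


# Toy N-best list with made-up scores, for illustration only.
NBEST = [
    (["the", "cat", "sat"], math.log(0.5)),
    (["the", "cat", "sit"], math.log(0.3)),
    (["the", "cot", "sat"], math.log(0.2)),
]
POSTERIORS = word_posteriors(NBEST)
FLAGGED = error_hypotheses(POSTERIORS, threshold=0.75)
```

In a pipeline like the one the abstract describes, the flagged words would then go to a classifier and to a rescoring step with the POS-based language model. Both are outside the scope of this sketch.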
Document type:
Conference paper
International Conference on Document Analysis and Recognition (ICDAR), Jul 2009, Barcelona, Spain. pp.466-470, 2009

https://hal.archives-ouvertes.fr/hal-00582433
Contributor: Solen Quiniou
Submitted on: Friday, April 1, 2011 - 14:34:54
Last modified on: Monday, January 14, 2019 - 10:00:06
Document(s) archived on: Saturday, July 2, 2011 - 02:44:52

File

quiniou09handling.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00582433, version 1

Citation

Solen Quiniou, Mohamed Cheriet, Eric Anquetil. Handling out-of-vocabulary words and recognition errors based on word linguistic context for handwritten sentence recognition. International Conference on Document Analysis and Recognition (ICDAR), Jul 2009, Barcelona, Spain. pp.466-470, 2009. 〈hal-00582433〉

Metrics

Record views

239

File downloads

295