Conference paper, Year: 2006

Training language models without appropriate language resources: experiments with an AAC system for disabled people

Abstract

Statistical language models (LMs) depend heavily on their training resources. This dependence not only makes evaluation results difficult to interpret, it also degrades LM-based applications in actual use. The question has already been studied by others (e.g. Bellegarda, 2004). Considering a specific domain (text prediction in a communication aid for disabled people), we address the problem from a different point of view: the influence of the language register. Using corpora from five different registers, we discuss three methods for adapting a language model to the language resource actually in use, thereby reducing the effect of training dependency: (a) a simple cache model that raises the probability of the n most recently inserted words; (b) a user dictionary that stores every previously unseen word; and (c) a combined LM that interpolates a base model with a dynamically updated user model. Our evaluation is based on results obtained with a text prediction system built on a trigram LM.
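
The three adaptation methods named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes a trigram LM, whereas this toy version works on unigrams to stay short, and the class name `AdaptivePredictor`, the parameters `lam`, `cache_size` and `cache_boost`, and the multiplicative form of the cache boost are all assumptions made for illustration. Scores are unnormalised and serve only to rank completion candidates.

```python
from collections import Counter, deque


class AdaptivePredictor:
    """Toy unigram word predictor (hypothetical, for illustration only)."""

    def __init__(self, base_probs, lam=0.7, cache_size=3, cache_boost=2.0):
        self.base_probs = dict(base_probs)      # static base model P_base(w)
        self.lam = lam                          # assumed weight of the base model
        self.user_counts = Counter()            # dynamically updated user model
        self.user_dictionary = set()            # (b) words unseen by the base model
        self.cache = deque(maxlen=cache_size)   # (a) the n last inserted words
        self.cache_boost = cache_boost          # assumed multiplicative boost

    def observe(self, word):
        """Record a word the user has just inserted."""
        self.user_counts[word] += 1
        self.cache.append(word)
        if word not in self.base_probs:
            self.user_dictionary.add(word)

    def user_prob(self, word):
        """Relative frequency of the word in the user's own text so far."""
        total = sum(self.user_counts.values())
        return self.user_counts[word] / total if total else 0.0

    def score(self, word):
        """(c) Linear interpolation of base and user model, then (a) cache boost."""
        p = (self.lam * self.base_probs.get(word, 0.0)
             + (1.0 - self.lam) * self.user_prob(word))
        if word in self.cache:
            p *= self.cache_boost
        return p

    def predict(self, prefix, k=3):
        """Return the k best-scoring known words that start with the prefix."""
        vocab = set(self.base_probs) | self.user_dictionary
        candidates = [w for w in vocab if w.startswith(prefix)]
        return sorted(candidates, key=lambda w: (-self.score(w), w))[:k]


predictor = AdaptivePredictor({"the": 0.5, "there": 0.3, "then": 0.2})
before = predictor.predict("th", k=1)
predictor.observe("thermos")
predictor.observe("thermos")
after = predictor.predict("th", k=3)
print(before, after)
```

In this sketch a word absent from the base model becomes predictable as soon as the user has typed it once, which is the effect the user dictionary and the user model are meant to achieve when the training register differs from the user's own.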
No file deposited

Dates and versions

hal-01023999, version 1 (15-07-2014)

Identifiers

  • HAL Id: hal-01023999, version 1

Cite

Tonio Wandmacher, Jean-Yves Antoine. Training language models without appropriate language resources: experiments with an AAC system for disabled people. 5th European Conference on Language Resource and Evaluation, 2006, Genoa, Italy. pp. 1842-1845. ⟨hal-01023999⟩