Training language models without appropriate language resources: experiments with an AAC system for disabled people

Tonio Wandmacher; Jean-Yves Antoine

Conference paper, Year: 2006

Tonio Wandmacher

Role: Author

Laboratoire d'Informatique Fondamentale et Appliquée de Tours

Jean-Yves Antoine

Role: Author
PersonId: 4673
IdHAL: jean-yves-antoine
IdRef: 137158319

Laboratoire d'Informatique Fondamentale et Appliquée de Tours

Abstract

Statistical language models (LMs) are highly dependent on their training resources. This dependency not only makes evaluation results difficult to interpret, it also degrades the performance of LM-based applications. The question has already been studied by others (e.g. Bellegarda, 2004). Considering a specific domain (text prediction in a communication aid for handicapped people), we address the problem from a different point of view: the influence of the language register. Using corpora from five different registers, we discuss three methods for adapting a language model to its actual language resource, ultimately reducing the effect of training dependency: (a) a simple cache model augmenting the probability of the last n inserted words; (b) a user dictionary, keeping every unseen word; and (c) a combined LM interpolating a base model with a dynamically updated user model. Our evaluation is based on the results obtained from a text prediction system working on a trigram LM.
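
The three adaptation ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a unigram model instead of the trigram LM mentioned above, it folds all three ideas into one class for brevity, and the class name `AdaptiveUnigramPredictor`, the helper `relative_freq`, and the parameters `lam`, `cache_size` and `cache_weight` (with their default values) are assumptions chosen only for illustration.

```python
from collections import Counter, deque


def relative_freq(counts):
    """Turn raw counts into a probability distribution (empty dict if no data)."""
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


class AdaptiveUnigramPredictor:
    """Toy word predictor: fixed base model + growing user model + recency cache."""

    def __init__(self, base_counts, lam=0.7, cache_size=3, cache_weight=0.2):
        self.base = relative_freq(base_counts)  # fixed, trained offline
        self.user_counts = Counter()            # (b) keeps every word the user enters
        self.cache = deque(maxlen=cache_size)   # (a) the n most recently entered words
        self.lam = lam                          # hypothetical interpolation weight
        self.cache_weight = cache_weight        # hypothetical cache weight

    def observe(self, word):
        """Record a word the user entered, including words unseen in the base model."""
        self.user_counts[word] += 1
        self.cache.append(word)

    def prob(self, word):
        user = relative_freq(self.user_counts)
        if user:
            # (c) linear interpolation of the base model and the user model
            p = self.lam * self.base.get(word, 0.0) + (1 - self.lam) * user.get(word, 0.0)
        else:
            # no user data yet: fall back to the base model alone
            p = self.base.get(word, 0.0)
        if self.cache:
            # (a) mix in the relative frequency of the word among the last n inputs
            p_cache = self.cache.count(word) / len(self.cache)
            p = (1 - self.cache_weight) * p + self.cache_weight * p_cache
        return p

    def predict(self, k=3):
        """Return the k most probable words, ties broken alphabetically."""
        candidates = set(self.base) | set(self.user_counts)
        return sorted(candidates, key=lambda w: (-self.prob(w), w))[:k]


predictor = AdaptiveUnigramPredictor({"the": 5, "cat": 3, "dog": 2})
first_guess = predictor.predict(1)    # driven by the base model only
predictor.observe("zebra")
predictor.observe("zebra")
adapted_guess = predictor.predict(1)  # now reflects the user's own vocabulary
```

In this toy setting a word that is absent from the base corpus has probability zero until the user types it, after which the user model and the cache both raise its rank. A real system would condition on the preceding words and would tune the weights on held-out data.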

Domains

Computation and Language [cs.CL]

Contributor: Denis Maurel

https://hal.science/hal-01023999

Submitted on: Tuesday, 15 July 2014, 15:20:18

Last modified on: Friday, 16 February 2024, 18:16:04

Dates and versions

hal-01023999, version 1 (15-07-2014)

Identifiers

HAL Id: hal-01023999, version 1

Cite

Tonio Wandmacher, Jean-Yves Antoine. Training language models without appropriate language resources: experiments with an AAC system for disabled people. 5th European Conference on Language Resource and Evaluation, 2006, Genoa, Italy. pp. 1842-1845. ⟨hal-01023999⟩


Collections

UNIV-TOURS CNRS LIFAT INSA-GROUPE INSA-CVL
