Methods to integrate a language model with semantic information for a word prediction component

Tonio Wandmacher; Jean-Yves Antoine

Conference paper, Year: 2007


Tonio Wandmacher

Role: Author

Bases de données et traitement des langues naturelles

Jean-Yves Antoine

Role: Author
PersonId : 4673
IdHAL : jean-yves-antoine
IdRef : 137158319

Bases de données et traitement des langues naturelles

Abstract

Most current word prediction systems use n-gram language models (LMs) to estimate the probability of the next word in a phrase. In recent years there have been many attempts to enrich such language models with additional syntactic or semantic information. We explore the predictive power of Latent Semantic Analysis (LSA), a method that has been shown to provide reliable information on long-distance semantic dependencies between words in a context. We present and evaluate several methods that integrate LSA-based information with a standard language model: a semantic cache, partial reranking, and different forms of interpolation. We found that all methods yield significant improvements over the 4-gram baseline, and that most of them also improve on a simple cache model.
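The abstract names interpolation as one way of combining LSA-based information with an n-gram model but gives no formula. As an illustration only, the sketch below shows plain linear interpolation of two next-word distributions. The function names, the weight `lam`, and the toy probabilities are all hypothetical, and this generic scheme is not claimed to be the exact variant evaluated in the paper.

```python
def interpolate(p_ngram, p_semantic, lam=0.3):
    """Linearly mix two next-word distributions over a shared vocabulary.

    lam is the weight given to the semantic (e.g. LSA-derived) distribution;
    (1 - lam) goes to the n-gram distribution. Both inputs are plain dicts
    mapping word -> probability. This is a generic sketch, not the paper's code.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    vocab = set(p_ngram) | set(p_semantic)
    mixed = {
        w: (1.0 - lam) * p_ngram.get(w, 0.0) + lam * p_semantic.get(w, 0.0)
        for w in vocab
    }
    # Renormalize in case the inputs do not cover exactly the same words.
    total = sum(mixed.values())
    if total > 0:
        mixed = {w: p / total for w, p in mixed.items()}
    return mixed


def top_k(dist, k=3):
    """Return the k most probable words, breaking ties alphabetically."""
    return sorted(dist, key=lambda w: (-dist[w], w))[:k]


# Toy, made-up numbers: the n-gram model prefers "bank",
# the semantic model prefers "river".
p_ngram = {"bank": 0.5, "river": 0.2, "money": 0.3}
p_semantic = {"bank": 0.2, "river": 0.7, "money": 0.1}

mixed = interpolate(p_ngram, p_semantic, lam=0.5)
suggestions = top_k(mixed, k=2)
```

In a word prediction setting, the list returned by `top_k` would be the candidates shown to the user. The weight `lam` would normally be tuned on held-out text rather than fixed by hand.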

Keywords

Alternative and Augmentative Communication; word prediction; language models; semantic adaptation; thematic adaptation; latent semantic analysis

Domains

Computation and Language [cs.CL]


https://hal.science/hal-00280477

Submitted on: Sunday, May 18, 2008, 22:31:48

Last modified on: Friday, February 16, 2024, 18:16:04

Dates and versions

hal-00280477 , version 1 (18-05-2008)

Identifiers

HAL Id : hal-00280477 , version 1
ARXIV : 0801.4716

Cite

Tonio Wandmacher, Jean-Yves Antoine. Methods to integrate a language model with semantic information for a word prediction component. ACL joint Conference on Empirical Methods in Natural Language Processing and Conference on Computational Natural Language Learning, EMNLP/CoNLL'2007, Jun 2007, Prague, Czech Republic. pp.506-513. ⟨hal-00280477⟩


Collections

UNIV-TOURS CNRS LIBDTLN LIFAT INSA-GROUPE INSA-CVL

