Phrase table pruning for Statistical Machine Translation - HAL open archive
Report (Technical Report) Year: 2009

Phrase table pruning for Statistical Machine Translation

Abstract

Phrase-based Statistical Machine Translation systems model the translation process using pairs of corresponding word sequences extracted from parallel corpora. These biphrases are stored in phrase tables that typically contain several million entries, making it difficult to assess their quality without running the full translation process. Our work is an exemplifying study of phrase tables generated from the French-to-English Europarl data. We give statistical information about the biphrases contained in the phrase table, evaluate the coverage of previously unseen sentences, and analyse the effects of pruning on translation.
Main file
TechReport.C-2009-22.pdf (617.96 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01399371, version 1 (18-11-2016)

Identifiers

  • HAL Id: hal-01399371, version 1

Cite

Esther Galbrun. Phrase table pruning for Statistical Machine Translation. [Technical Report] C-2009-22, University of Helsinki. 2009, pp.38. ⟨hal-01399371⟩

Collections

LARA
32 views
243 downloads
