Conference paper, Year: 2015

Morphology-Aware Alignments for Translation to and from a Synthetic Language

Abstract

Most statistical translation models rely on the unsupervised computation of word-based alignments, which serve both to identify elementary translation units and to uncover hidden translation derivations. It is widely acknowledged that such alignments can only be reliably established for languages that share a sufficiently close notion of a word. When this is not the case, the usual method is to pre-process the data so as to balance the number of tokens on both sides of the corpus. In this paper, we propose a factored alignment model specifically designed to handle alignments involving a synthetic language (using the case of the Czech-English language pair). We show that this model can greatly reduce the number of non-aligned words on the English side, yielding more compact translation models, with little impact on translation quality in our testing conditions.
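
To make the "non-aligned words on the English side" quantity concrete, the following is a minimal sketch of how such a count could be computed from word alignments. It is not the factored model proposed in the paper. It assumes alignments are given as space-separated `i-j` links (source position, target position), a format commonly produced by word aligners; the function names and the toy data are hypothetical and chosen only for illustration.

```python
def unaligned_target_positions(alignment: str, target_len: int) -> list[int]:
    """Return the target-side token positions that appear in no alignment link.

    `alignment` is assumed to be a string of "i-j" links, where i indexes the
    source sentence and j indexes the target sentence (both 0-based).
    """
    aligned = set()
    for link in alignment.split():
        _src, tgt = link.split("-")
        aligned.add(int(tgt))
    return [j for j in range(target_len) if j not in aligned]


def unaligned_rate(pairs: list[tuple[str, int]]) -> float:
    """Fraction of target tokens left unaligned over a list of sentence pairs.

    Each element of `pairs` is (alignment string, target sentence length).
    """
    total = 0
    unaligned = 0
    for alignment, target_len in pairs:
        total += target_len
        unaligned += len(unaligned_target_positions(alignment, target_len))
    return unaligned / total if total else 0.0


# Toy illustration (not data from the paper): one source token aligned only
# to the third of three target tokens, as might happen when a single inflected
# word corresponds to a preposition, an article, and a noun.
toy_pairs = [("0-2", 3), ("0-0 1-1", 2)]
print(unaligned_target_positions("0-2", 3))
print(unaligned_rate(toy_pairs))
```

A lower rate on the target side means fewer tokens are left without any counterpart, which is the kind of reduction the abstract reports for English.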
No file deposited

Dates and versions

hal-01635005, version 1 (14-11-2017)

Identifiers

  • HAL Id: hal-01635005, version 1

Cite

Franck Burlot, François Yvon. Morphology-Aware Alignments for Translation to and from a Synthetic Language. International Workshop on Spoken Language Translation, Jan 2015, Da Nang, Vietnam. ⟨hal-01635005⟩