Conference paper, Year: 2014

Rule-based reordering spaces in statistical machine translation

Abstract

In Statistical Machine Translation (SMT), the constraints on word reorderings have a great impact on the set of potential translations that are explored. Computational issues aside, the reordering space of an SMT system needs to be designed with great care: while a larger search space is likely to yield better translations, it may also lead to more decoding errors because of the added ambiguity and the interaction with the pruning strategy. In this paper, we study this trade-off using a state-of-the-art translation system in which all reorderings are represented in a word lattice prior to decoding. This allows us to directly explore and compare different reordering spaces. We study in detail a rule-based preordering system, varying the length or number of rules and the tagset used, and contrasting it with oracle settings and purely combinatorial subsets of permutations. We focus on two language pairs: English-French, a close language pair, and English-German, known to be more challenging for reordering.
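
To give a feel for how quickly a reordering space can grow, here is a minimal Python sketch of one possible "purely combinatorial" subset of permutations. The abstract does not say which subsets the paper uses, so the constraint below (no word may move more than `max_shift` positions from its source position) is an assumption chosen for illustration only, and the function names `window_permutations` and `space_size` are hypothetical.

```python
from itertools import permutations


def window_permutations(n, max_shift):
    """Yield the permutations of range(n) in which every word stays within
    max_shift positions of its source index.

    Illustrative constraint only (an assumption, not the paper's definition).
    Brute-force enumeration, so keep n small.
    """
    for perm in permutations(range(n)):
        # perm[tgt] is the source index of the word placed at target index tgt.
        if all(abs(src - tgt) <= max_shift for tgt, src in enumerate(perm)):
            yield perm


def space_size(n, max_shift):
    """Count the reorderings allowed for a sentence of n words."""
    return sum(1 for _ in window_permutations(n, max_shift))


if __name__ == "__main__":
    # Size of the reordering space for a 4-word sentence as the limit loosens.
    for max_shift in range(4):
        print(max_shift, space_size(4, max_shift))
```

For a four-word sentence, the count goes from 1 (monotone order only) to 5, 14, and finally all 24 permutations as `max_shift` increases from 0 to 3. This is the kind of growth that makes the trade-off between a richer search space and more decoding errors worth studying.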

No file deposited

Dates and versions

hal-01908354, version 1 (30-10-2018)

Identifiers

  • HAL Id: hal-01908354, version 1

Cite

Nicolas Pécheux, Alexandre Allauzen, François Yvon. Rule-based reordering spaces in statistical machine translation. International Conference on Language Resources and Evaluation, Jan 2014, Reykjavik, Iceland. ⟨hal-01908354⟩