Statistical Post-Editing of Machine Translation for Domain Adaptation - HAL open archive
Conference paper, Year: 2012

Statistical Post-Editing of Machine Translation for Domain Adaptation

Raphaël Rubino
  • Role: Author
  • PersonId : 776116
  • IdRef : 172390621
Stéphane Huet
Fabrice Lefèvre
Georges Linarès

Abstract

This paper presents a statistical approach for adapting out-of-domain machine translation systems to the medical domain through an unsupervised post-editing step. A statistical post-editing model is built on statistical machine translation (SMT) outputs aligned with their reference translations. Evaluations on French-to-English translation of medical texts show that an out-of-domain machine translation system can be adapted a posteriori to a specific domain. Two SMT systems are studied: a state-of-the-art phrase-based implementation and a publicly available online system. Our experiments also indicate that selecting sentences for post-editing leads to significant improvements in translation quality, and that further gains are still possible with respect to an oracle measure.
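The sentence selection and oracle measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes one common reading of an oracle: for each sentence, keep whichever of the raw SMT output or the post-edited output scores higher against the reference. The function names `overlap_f1` and `oracle_select` are hypothetical, and the token-overlap F1 score is a crude stand-in for a real translation metric; neither is taken from the paper.

```python
from collections import Counter


def overlap_f1(hypothesis: str, reference: str) -> float:
    """Token-overlap F1 (assumption: a simple stand-in for a real MT metric)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it
    # appears in both sentences.
    common = sum((Counter(hyp) & Counter(ref)).values())
    if common == 0:
        return 0.0
    precision, recall = common / len(hyp), common / len(ref)
    return 2 * precision * recall / (precision + recall)


def oracle_select(smt_outputs, post_edited, references):
    """Per sentence, keep the post-edited output only if it scores strictly higher."""
    chosen = []
    for smt, spe, ref in zip(smt_outputs, post_edited, references):
        better = overlap_f1(spe, ref) > overlap_f1(smt, ref)
        chosen.append(spe if better else smt)
    return chosen
```

Because the oracle consults the reference, it gives an upper bound on what any reference-free sentence selector could achieve, which is why a gap to the oracle suggests that further gains are possible.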
Main file
EAMT12.pdf (213.62 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01320242 , version 1 (27-02-2019)

Identifiers

  • HAL Id : hal-01320242 , version 1

Cite

Raphaël Rubino, Stéphane Huet, Fabrice Lefèvre, Georges Linarès. Statistical Post-Editing of Machine Translation for Domain Adaptation. 16th Annual Conference of the European Association for Machine Translation (EAMT), May 2012, Trento, Italy. ⟨hal-01320242⟩

Collections

UNIV-AVIGNON LIA
73 views
50 downloads
