Vers une démocratisation des outils de constitution de corpus parallèles [Towards a democratisation of tools for building parallel corpora]

Abstract : Machine translation (MT) has won its place in the world of translation: MT-related content (such as post-editing) is now a fixture in the translation curriculum, and in professional settings MT is accessed through plug-ins within CAT environments. Nonetheless, MT engine customisation, a task crucial to an MT system’s performance, too often remains out of translators’ reach. Bilingual corpora are rarely available, and those that exist are often ill-suited to the task (few are domain-specific). Moreover, the tools available to (trainee) translators for building training corpora are still too complex for them to use. Our work aims at democratising such tools. As part of a hands-on activity, we set out to simplify the parallel corpus building process by assembling a ‘toolbox’ which handles the process as a sequence of easier-to-handle tasks. Further automation of the process is possible.

https://hal.archives-ouvertes.fr/hal-01347090
Contributor : Fabienne Moreau
Submitted on : Wednesday, July 20, 2016 - 12:17:17 PM
Last modification on : Wednesday, May 16, 2018 - 11:23:01 AM

File

moreau_efraim_rennes2_corpus-p...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01347090, version 1

Citation

Octavia Efraim, Fabienne Moreau. Vers une démocratisation des outils de constitution de corpus parallèles. Conférence TAO-CAT 2015, Jun 2015, Angers, France. ⟨hal-01347090⟩
