Translation of sublanguages by subgrammars
Abstract
This paper compares the performance of two data-driven methods for translating a very constrained sublanguage: dates. As a first result, we show that a statistical method outperforms an example-based method for the translation of dates from Chinese into English when small random training corpora are used; with 750 random examples, both methods translate a corpus of 4,018 dates almost perfectly. As a second result, we prove that 58 dates theoretically suffice to translate the same corpus of 4,018 dates perfectly. We verify this experimentally with an example-based method, whereas a statistical method fails to translate 345 of the 4,018 dates.
Origin: Files produced by the author(s)