Learning Morphological Normalization for Translation from and into Morphologically Rich Languages

Abstract: When translating between a morphologically rich language (MRL) and English, word forms in the MRL often encode grammatical information that is irrelevant to English, leading to data sparsity issues. This problem can be mitigated by normalizing the MRL, that is, by removing the irrelevant information from it. Such preprocessing is usually performed deterministically, using hand-crafted rules that yield suboptimal representations. We introduce a simple way to automatically compute an appropriate normalization of the MRL and show that it can improve machine translation in both directions.
Document type: Journal article

Cited literature: 27 references

https://hal.archives-ouvertes.fr/hal-01618382
Contributor: Dev.Limsi
Submitted on: Tuesday, October 17, 2017 - 6:19:27 PM
Last modification on: Monday, February 10, 2020 - 6:14:06 PM
Archived on: Thursday, January 18, 2018 - 3:11:11 PM

File

art-burlot-yvon.pdf
Publisher files allowed on an open archive

Citation

Franck Burlot, François Yvon. Learning Morphological Normalization for Translation from and into Morphologically Rich Languages. The Prague Bulletin of Mathematical Linguistics, 2017, 108, pp.49-60. ⟨10.1515/pralin-2017-0008⟩. ⟨hal-01618382⟩
