Conference paper, Year: 2021

Improving Machine Translation of Arabic Dialects through Multi-Task Learning

Abstract

Neural Machine Translation (NMT) systems have been shown to perform impressively on many language pairs compared to Statistical Machine Translation (SMT). However, these systems are data-intensive, which is problematic for most language pairs and especially for low-resource languages. In this work, we address this issue for certain Arabic dialects: variants of Modern Standard Arabic (MSA) that have non-standard spelling and rich morphology, yet remain resource-poor. We experiment with several multi-task learning strategies to take advantage of the relationships between these dialects. Despite the simplicity of this idea, empirical results show that several multi-task learning strategies can substantially outperform statistical machine translation. For instance, we obtain BLEU scores of 35.06 for Algerian → Modern Standard Arabic and 27.55 for Moroccan → Palestinian, while a statistical method obtains 15.1 and 18.91, respectively. Across 42 machine translation experiments, and despite the use of a small corpus, multi-task learning outperforms statistical machine translation in 88% of cases.
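For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a BLEU score can be computed for a single sentence against a single reference. It is an illustration only, not the evaluation setup of the paper: the function names `ngrams` and `bleu` are hypothetical, whitespace tokenization and the absence of smoothing are simplifying assumptions, and published scores are normally computed at corpus level with a standard tool.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Single-reference, sentence-level BLEU on a 0-100 scale (no smoothing).

    Assumption: both inputs are plain strings tokenized on whitespace.
    """
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clipped matches: a candidate n-gram is credited at most as many
        # times as it occurs in the reference.
        matched = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or matched == 0:
            # Without smoothing, one empty n-gram order zeroes the score.
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled to 0-100.
    return 100 * bp * math.exp(sum(log_precisions) / max_n)
```

Calling `bleu(hypothesis, reference)` on an identical pair of sentences gives the maximum score, and any hypothesis sharing no 4-gram with its reference scores zero, which is why corpus-level aggregation or smoothing is usually preferred for short sentences.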
Main file
Springer_Lecture_Notes_in_Computer_Science__4_.pdf (594.01 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03435996, version 1 (19-11-2021)

Identifiers

  • HAL Id: hal-03435996, version 1

Cite

Youness Moukafih, Nada Sbihi, Mounir Ghogho, Kamel Smaïli. Improving Machine Translation of Arabic Dialects through Multi-Task Learning. 20th International Conference of the Italian Association for Artificial Intelligence (AIxIA 2021), Dec 2021, Milan/Virtual, Italy. ⟨hal-03435996⟩