Arabizi transliteration of Algerian Arabic dialect into Modern Standard Arabic
Abstract
Machine transliteration is an important research area within machine translation. Neural machine transliteration (NMTR) is a recent approach that has shown promising results. However, research on NMTR for Arabic has only just begun to yield results, and no work has yet addressed neural transliteration of Arabic dialect written in Latin letters, known as “Arabizi”.
In this paper, we propose a character-level neural method for transliterating Arabizi into Arabic script. Our method consists of two main steps: 1) construction of an Arabizi corpus; 2) character-based neural transliteration of Arabizi into Arabic.
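As a rough illustration of what "character-level" means here, the sketch below splits Arabizi and Arabic words into character sequences and maps them to integer ids, which is the usual input format for a sequence-to-sequence model. It is not the preprocessing used in this work: the word pairs, the special tokens, and the function names `build_vocab` and `encode` are all hypothetical and chosen only for the example.

```python
from typing import Dict, Iterable, List

# Hypothetical special tokens; the paper does not specify any.
PAD, SOS, EOS, UNK = "<pad>", "<s>", "</s>", "<unk>"


def build_vocab(words: Iterable[str]) -> Dict[str, int]:
    """Assign an integer id to every character seen in `words`."""
    vocab = {tok: i for i, tok in enumerate([PAD, SOS, EOS, UNK])}
    for word in words:
        for ch in word:
            vocab.setdefault(ch, len(vocab))
    return vocab


def encode(word: str, vocab: Dict[str, int]) -> List[int]:
    """Turn a word into start id + one id per character + end id."""
    body = [vocab.get(ch, vocab[UNK]) for ch in word]
    return [vocab[SOS]] + body + [vocab[EOS]]


# Illustrative (Arabizi, Arabic) pairs, not taken from the paper's corpus.
pairs = [("salam", "سلام"), ("kifach", "كيفاش")]

src_vocab = build_vocab(src for src, _ in pairs)
tgt_vocab = build_vocab(tgt for _, tgt in pairs)

# Each pair becomes a (source ids, target ids) training example.
encoded = [(encode(s, src_vocab), encode(t, tgt_vocab)) for s, t in pairs]
```

Working at the character level keeps both vocabularies small and lets the model handle spelling variants it has never seen as whole words, which matters for a non-standardized writing system such as Arabizi.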
We evaluated the method on an internal and an external dataset. The best precision obtained was 73.66% on the internal dataset and 45.35% on the external one. We also ran the same experiments with Statistical Machine Transliteration (SMTR), which has been studied extensively in the literature; NMTR obtains substantial improvements over SMTR (up to 2.18%).